# News aggregator

### Dominic Orchard: Rearranging equations using a zipper

Whilst experimenting with some ideas for a project, I realised I needed a quick piece of code to rearrange equations (defined in terms of +, *, -, and /) in AST form, e.g., given an AST for the equation x = y + 3, rearrange to get y = x - 3.

I realised that equations can be formulated as *zippers* over an AST, where operations for navigating the zipper essentially rearrange the equation. I thought this was quite neat, so I thought I would show the technique here. The code is in simple Haskell.

I’ll show the construction for a simple arithmetic calculus with the following AST data type of terms:

data Term = Add Term Term | Mul Term Term | Div Term Term | Sub Term Term | Neg Term | Var String | Const Integerwith some standard pretty printing code:

instance Show Term where show (Add t1 t2) = (show' t1) ++ " + " ++ (show' t2) show (Mul t1 t2) = (show' t1) ++ " * " ++ (show' t2) show (Sub t1 t2) = (show' t1) ++ " - " ++ (show' t2) show (Div t1 t2) = (show' t1) ++ " / " ++ (show' t2) show (Neg t) = "-" ++ (show' t) show (Var v) = v show (Const n) = show nwhere show' is a helper to minimise brackets e.g. pretty printing “-(v)” as “-v”.

show' :: Term -> String show' (Var v) = v show' (Const n) = show n show' t@(Neg (Var v)) = show t show' t@(Neg (Const n)) = show t show' t = "(" ++ show t ++ ")"Equations can be defined as pairs of terms, i.e., ‘T1 = T2′ where T1 and T2 are both represented by values of Term. However, instead, I’m going to represent equations via a zipper.

Zippers (described beautifully in the paper by Huet) represent values that have some subvalue “in focus”. The position of the focus can then be shifted through the value, refocussing on different parts. This is encoded by pairing a focal subvalue with a *path* to this focus, which records the rest of the value that is not in focus. For equations, the zipper type pairs a focus Term (which we’ll think of as the left-hand side of the equation) with a path (which we’ll think of as the right-hand side of the equation).

Paths give a sequence of direction markers, essentially providing an address to the term in focus, starting from the root, where each marker is accompanied with the label of the parent node and the subtree of the branch not taken, i.e., a path going left is paired with the right subtree (which is not on the path to the focus).

data Path = Top (Either Integer String) -- At top: constant or variable | Bin Op -- OR in a binary operation Op, Dir -- in either left (L) or right (R) branch Term -- with the untaken branch Path -- and the rest of the equation | N Path -- OR in the unary negation operation data Dir = L | R data Op = A | M | D | S | So | DoThe Op type gives tags for every operation, as well as additional tags So and Do which represent sub and divide but with arguments flipped. This is used to get an isomorphism between the operations that zip “up” and “down” the equation zipper, refocussing on subterms.

A useful helper maps tags to their operations:

opToTerm :: Op -> (Term -> Term -> Term) opToTerm A = Add opToTerm M = Mul opToTerm D = Div opToTerm S = Sub opToTerm So = (\x -> \y -> Sub y x) opToTerm Do = (\x -> \y -> Div y x)Equations are pretty printed as follows:

instance Show Path where show p = show . pathToTerm $ p instance Show Equation where show (Eq t p) = (show t) ++ " = " ++ (show p)where pathToTerm converts paths to terms:

pathToTerm :: Path -> Term pathToTerm (Top (Left c)) = Const c pathToTerm (Top (Right v))= Var v pathToTerm (Bin op L t p) = (opToTerm op) (pathToTerm p) t pathToTerm (Bin op R t p) = (opToTerm op) t (pathToTerm p) pathToTerm (N p) = Neg (pathToTerm p)Now onto the zipper operations which providing rebalancing of the equation. Equations are zipped-down by left and right, which for a binary operation focus on either the left or right argument respectively, for unary negation focus on the single argument, and for constants or variables does nothing. When going left or right, the equations are rebalanced with their inverse arithmetic operations (show in the comments here):

left (Eq (Var s) p) = Eq (Var s) p left (Eq (Const n) p) = Eq (Const n) p left (Eq (Add t1 t2) p) = Eq t1 (Bin S L t2 p) -- t1 + t2 = p -> t1 = p - t2 left (Eq (Mul t1 t2) p) = Eq t1 (Bin D L t2 p) -- t1 * t2 = p -> t1 = p / t2 left (Eq (Div t1 t2) p) = Eq t1 (Bin M L t2 p) -- t1 / t2 = p -> t1 = p * t2 left (Eq (Sub t1 t2) p) = Eq t1 (Bin A L t2 p) -- t1 - t2 = p -> t1 = p + t2 left (Eq (Neg t) p) = Eq t (N p) -- -t = p -> t = -p right (Eq (Var s) p) = Eq (Var s) p right (Eq (Const n) p) = Eq (Const n) p right (Eq (Add t1 t2) p) = Eq t2 (Bin So R t1 p) -- t1 + t2 = p -> t2 = p - t1 right (Eq (Mul t1 t2) p) = Eq t2 (Bin Do R t1 p) -- t1 * t2 = p -> t2 = p / t1 right (Eq (Div t1 t2) p) = Eq t2 (Bin D R t1 p) -- t1 / t2 = p -> t2 = t1 / p right (Eq (Sub t1 t2) p) = Eq t2 (Bin S R t1 p) -- t1 - t2 = p -> t2 = t1 - p right (Eq (Neg t) p) = Eq t (N p)In both left and right, Add and Mul become subtraction and dividing, but in right in order for the the zipping-up operation to be the inverse, subtraction and division are represented using the flipped So and Do markers.

Equations are zipped-up by up, which unrolls one step of the path and reforms the term on the left-hand side from that on the right. This is the inverse of left and right:

up (Eq t1 (Top a)) = Eq t1 (Top a) up (Eq t1 (Bin A L t2 p)) = Eq (Sub t1 t2) p -- t1 = t2 + p -> t1 - t2 = p up (Eq t1 (Bin M L t2 p)) = Eq (Div t1 t2) p -- t1 = t2 * p -> t1 / t2 = p up (Eq t1 (Bin D L t2 p)) = Eq (Mul t1 t2) p -- t1 = p / t2 -> t1 * t2 = p up (Eq t1 (Bin S L t2 p)) = Eq (Add t1 t2) p -- t1 = p - t2 -> t1 + t2 = p up (Eq t1 (Bin So R t2 p)) = Eq (Add t2 t1) p -- t1 = p - t2 -> t2 + t1 = p up (Eq t1 (Bin Do R t2 p)) = Eq (Mul t2 t1) p -- t1 = p / t2 -> t2 * t1 = p up (Eq t1 (Bin D R t2 p)) = Eq (Div t2 t1) p -- t1 = t2 / p -> t2 / t1 = p up (Eq t1 (Bin S R t2 p)) = Eq (Sub t2 t1) p -- t1 = t2 - p -> t2 - t1 = p up (Eq t1 (N p)) = Eq (Neg t1) p -- t1 = -p -> -t1 = pAnd that’s it! Here is an example of its use from GHCi.

foo = Eq (Sub (Mul (Add (Var "x") (Var "y")) (Add (Var "x") (Const 1))) (Const 1)) (Top (Left 0)) *Main> foo ((x + y) * (x + 1)) - 1 = 0 *Main> left $ foo (x + y) * (x + 1) = 0 + 1 *Main> right . left $ foo x + 1 = (0 + 1) / (x + y) *Main> left . right . left $ foo x = ((0 + 1) / (x + y)) - 1 *Main> up . left . right . left $ foo x + 1 = (0 + 1) / (x + y) *Main> up . up . left . right . left $ foo (x + y) * (x + 1) = 0 + 1 *Main> up . up . up . left . right . left $ foo ((x + y) * (x + 1)) - 1 = 0It is straightforward to prove that: up . left $ x = x (when left x is not equal to x) and up . right $ x = x(when right x is not equal to x).

Note, I am simply rebalancing the syntax of equations: this technique does not help if you have multiple uses of a variable and you want to solve the question for a particular variable, e.g. y = x + 1/(3x), or quadratics.

Here’s a concluding thought. The navigation operations left, right, and up essentially apply the inverse of the operation in focus to each side of the equation. We could therefore reformulate the navigation operations in terms of any *group*: given a term L ⊕ R under focus where ⊕ is the binary operation of a group with inverse operation -1, then navigating left applies ⊕ R-1 to both sides and navigating right applies ⊕ L-1. However, in this blog post there is a slight difference: navigating applies the inverse to both sides and then reduces the term of the left-hand side using the group axioms X ⊕ X-1 = I (where I is the identity element of the group) and X ⊕ I = X such that the term does not grow larger and larger with inverses.

I wonder if there are other applications, which have a group structure (or number of interacting groups), for which the above zipper approach would be useful?

### Douglas M. Auclair (geophf): 1HaskellADay: Up, up, and away!

### How to memoize a function that receives a tree?

How to properly memoize su? Considering checking for the existence of a Tree in a hash is O(n), it will not work as well as expected for **huge** trees. I've thought in using this idea to make equality O(1), but I'm not sure the linked library on the answer is supposed to be used for that. How would you do this?

[link] [3 comments]

### Dominic Orchard: The Four Rs of Programming Language Design (revisited)

Following my blog post last year about the “four Rs of programming language design” I wrote a short essay expanding upon the idea which has now been included in the online post-proceedings of the ACM Onward ’11 essay track (part of the SPLASH conference).

The essay was a lot of fun to write (I somehow end up referencing books from the 19th century, a sci-fi novel, 1984, and The Matrix!) and is a kind of mission statement (at least for myself) for language design; it is available on my research page here. I hope that it provides some food-for-thought for others interested in, or working in, language design.

### ERDI Gergo: Arrow's place in the Applicative/Monad hierarchy

I've been thinking lately about arrows in relation to applicative functors and monads. The difference between the latter two is easy to intuit (and I'll describe it via an example below), but I never managed to get the same level of understanding about arrows. There's a somewhat famous paper about this question, which has a very clear-cut diagram showing that applicatives embed into arrows and arrows embed into monads (and both containments are non-trivial), which I understood as meaning every arrow is an applicative functor, and every monad is an arrow.

At first glance, this makes sense, given the well-known result that monads are exactly equivalent to arrows that are also instances of ArrowApply, as witnessed by the Haskell types Kleisli and ArrowMonad. However, there's no immediately obvious reason why you couldn't also turn an applicative functor into an arrow, so how does that leave any room for arrows to be different from applicatives? (As an aside, the fact that applicative functors have kind ⋆ → ⋆ and arrows have kind ⋆ → ⋆ → ⋆ has been a huge complication for me in trying to compare them).

Now, finally, based on the helpful replies to that StackOverflow question and the associated Reddit thread, I am confident enough to answer my own question.

Tom Ellis suggested thinking about a concrete example involving file I/O, so let's compare three approaches to it using the three typeclasses. To make things simple, we will only care about two operations: reading a string from a file and writing a string to a file. Files are going to be identified by their file path:

type FilePath = String Monadic I/OOur first I/O interface is defined as follows:

data IOM ∷ ⋆ → ⋆ instance Monad IOM readFile ∷ FilePath → IOM String writeFile ∷ FilePath → String → IOM ()Using this interface, we can for example copy a file from one path to another:

copy ∷ FilePath → FilePath → IOM () copy from to = readFile from >>= writeFile toHowever, we can do much more than that: the choice of files we manipulate can depend on effects upstream. For example, the below function takes an index file which contains a filename, and copies it to the given target directory:

copyIndirect ∷ FilePath → FilePath → IOM () copyIndirect index target = do from ← readFile index copy from (target ⟨/⟩ to)On the flip side, this means there is no way to know upfront the set of filenames that are going to be manipulated by a given value action ∷ IOM α. By "upfront", what I mean is the ability to write a pure function fileNames :: IOM α → [FilePath].

Of course, for non-IO-based monads (such as ones for which we have some kind of extractor function μ α → α), this distinction becomes a bit more fuzzy, but it still makes sense to think about trying to extract information without evaluating the effects of the monad (so for example, we could ask "what can we know about a Reader Γ α without having a value of type Γ at hand?").

The reason we can't really do static analysis in this sense on monads is because the function on the right-hand side of a bind is in the space of Haskell functions, and as such, is completely opaque.

So let's try restricting our interface to just an applicative functor.

Applicative I/Odata IOF ∷ ⋆ → ⋆ instance Applicative IOF readFile ∷ FilePath → IOF String writeFile ∷ FilePath → String → IOF ()Since IOF is not a monad, there's no way to compose readFile and writeFile, so all we can do with this interface is to either read from a file and then postprocess its contents purely, or write to a file; but there's no way to write the contents of a file into another one.

How about changing the type of writeFile?

writeFile′ ∷ FilePath → IOF (String → ())The main problem with this interface is that while it would allow writing something like

copy ∷ FilePath → FilePath → IOF () copy from to = writeFile′ to ⟨*⟩ readFile fromit leads to all kind of nasty problems because String → () is such a horrible model of writing a string to a file, since it breaks referential transparency. For example, what do you expect the contents of out.txt to be after running this program?

(λ write → [write "foo", write "bar", write "foo"]) ⟨$⟩ writeFile′ "out.txt" Two approaches to arrowized I/OFirst of all, let's get two arrow-based I/O interfaces out of the way that don't (in fact, can't) bring anything new to the table: Kleisli IOM and Applicarrow IOF.

The Kleisli-arrow of IOM, modulo currying, is:

readFile ∷ Kleisli IOM FilePath String writeFile ∷ Kleisli IOM (FilePath, String) ()Since writeFile's input still contains both the filename and the contents, we can still write copyIndirect (using arrow notation for simplicity). Note how the ArrowApply instance of Kleisli IOM is not even used.

copyIndirect ∷ Kleisli IOM (FilePath, FilePath) () copyIndirect = proc (index, target) → do from ← readFile ↢ index s ← readFile ↢ from writeFile ↢ (to, s)The Applicarrow of IOF would be:

readFile ∷ FilePath → Applicarrow IOF () String writeFile ∷ FilePath → String → Applicarrow IOF () ()which of course still exhibits the same problem of being unable to compose readFile and writeFile.

A proper arrowized I/O interfaceInstead of transforming IOM or IOF into an arrow, what if we start from scratch, and try to create something in between, in terms of where we use Haskell functions and where we make an arrow? Take the following interface:

data IOA ∷ ⋆ → ⋆ → ⋆ instance Arrow IOA readFile ∷ FilePath → IOA () String writeFile ∷ FilePath → IOA String ()Because writeFile takes the content from the input side of the arrow, we can still implement copy:

copy ∷ FilePath → FilePath → IOA () () copy from to = readFile from >>> writeFile toHowever, the other argument of writeFile is a purely functional one, and so it can't depend on the output of e.g. readFile; so copyIndirect can't be implemented with this interface.

If we turn this argument around, this also means that while we can't know in advance what will end up being written to a file (before running the full IOA pipeline itself), but we can statically determine the set of filenames that will be modified.

ConclusionMonads are opaque to static analysis, and applicative functors are poor at expressing dynamic-time data dependencies. It turns out arrows can provide a sweet spot between the two: by choosing the purely functional and the arrowized inputs carefully, it is possible to create an interface that allows for just the right interplay of dynamic behaviour and amenability to static analysis.

### ERDI Gergo: Arrow's place in the Applicative/Monad hierarchy

I've been thinking lately about arrows in relation to applicative functors and monads. The difference between the latter two is easy to intuit (and I'll describe it via an example below), but I never managed to get the same level of understanding about arrows. There's a somewhat famous paper about this question, which has a very clear-cut diagram showing that applicatives embed into arrows and arrows embed into monads (and both containments are non-trivial), which I understood as meaning every arrow is an applicative functor, and every monad is an arrow.

At first glance, this makes sense, given the well-known result that monads are exactly equivalent to arrows that are also instances of ArrowApply, as witnessed by the Haskell types Kleisli and ArrowMonad. However, there's no immediately obvious reason why you couldn't also turn an applicative functor into an arrow, so how does that leave any room for arrows to be different from applicatives? (As an aside, the fact that applicative functors have kind ⋆ → ⋆ and arrows have kind ⋆ → ⋆ → ⋆ has been a huge complication for me in trying to compare them).

Now, finally, based on the helpful replies to that StackOverflow question and the associated Reddit thread, I am confident enough to answer my own question.

Tom Ellis suggested thinking about a concrete example involving file I/O, so let's compare three approaches to it using the three typeclasses. To make things simple, we will only care about two operations: reading a string from a file and writing a string to a file. Files are going to be identified by their file path:

type FilePath = String Monadic I/OOur first I/O interface is defined as follows:

data IOM ∷ ⋆ → ⋆ instance Monad IOM readFile ∷ FilePath → IOM String writeFile ∷ FilePath → String → IOM ()Using this interface, we can for example copy a file from one path to another:

copy ∷ FilePath → FilePath → IOM () copy from to = readFile from >>= writeFile toHowever, we can do much more than that: the choice of files we manipulate can depend on effects upstream. For example, the below function takes an index file which contains a filename, and copies it to the given target directory:

copyIndirect ∷ FilePath → FilePath → IOM () copyIndirect index target = do from ← readFile index copy from (target ⟨/⟩ to)On the flip side, this means there is no way to know upfront the set of filenames that are going to be manipulated by a given value action ∷ IOM α. By "upfront", what I mean is the ability to write a pure function fileNames :: IOM α → [FilePath].

Of course, for non-IO-based monads (such as ones for which we have some kind of extractor function μ α → α), this distinction becomes a bit more fuzzy, but it still makes sense to think about trying to extract information without evaluating the effects of the monad (so for example, we could ask "what can we know about a Reader Γ α without having a value of type Γ at hand?").

The reason we can't really do static analysis in this sense on monads is because the function on the right-hand side of a bind is in the space of Haskell functions, and as such, is completely opaque.

So let's try restricting our interface to just an applicative functor.

Applicative I/Odata IOF ∷ ⋆ → ⋆ instance Applicative IOF readFile ∷ FilePath → IOF String writeFile ∷ FilePath → String → IOF ()Since IOF is not a monad, there's no way to compose readFile and writeFile, so all we can do with this interface is to either read from a file and then postprocess its contents purely, or write to a file; but there's no way to write the contents of a file into another one.

How about changing the type of writeFile?

writeFile′ ∷ FilePath → IOF (String → ())The main problem with this interface is that while it would allow writing something like

copy ∷ FilePath → FilePath → IOF () copy from to = writeFile′ to ⟨*⟩ readFile fromit leads to all kind of nasty problems because String → () is such a horrible model of writing a string to a file, since it breaks referential transparency. For example, what do you expect the contents of out.txt to be after running this program?

(λ write → [write "foo", write "bar", write "foo"]) ⟨$⟩ writeFile′ "out.txt" Two approaches to arrowized I/OFirst of all, let's get two arrow-based I/O interfaces out of the way that don't (in fact, can't) bring anything new to the table: Kleisli IOM and Applicarrow IOF.

The Kleisli-arrow of IOM, modulo currying, is:

readFile ∷ Kleisli IOM FilePath String writeFile ∷ Kleisli IOM (FilePath, String) ()Since writeFile's input still contains both the filename and the contents, we can still write copyIndirect (using arrow notation for simplicity). Note how the ArrowApply instance of Kleisli IOM is not even used.

copyIndirect ∷ Kleisli IOM (FilePath, FilePath) () copyIndirect = proc (index, target) → do from ← readFile ↢ index s ← readFile ↢ from writeFile ↢ (to, s)The Applicarrow of IOF would be:

readFile ∷ FilePath → Applicarrow IOF () String writeFile ∷ FilePath → String → Applicarrow IOF () ()which of course still exhibits the same problem of being unable to compose readFile and writeFile.

A proper arrowized I/O interfaceInstead of transforming IOM or IOF into an arrow, what if we start from scratch, and try to create something in between, in terms of where we use Haskell functions and where we make an arrow? Take the following interface:

data IOA ∷ ⋆ → ⋆ → ⋆ instance Arrow IOA readFile ∷ FilePath → IOA () String writeFile ∷ FilePath → IOA String ()Because writeFile takes the content from the input side of the arrow, we can still implement copy:

copy ∷ FilePath → FilePath → IOA () () copy from to = readFile from >>= writeFile toHowever, the other argument of writeFile is a purely functional one, and so it can't depend on the output of e.g. readFile; so copyIndirect can't be implemented with this interface.

If we turn this argument around, this also means that while we can't know in advance what will end up being written to a file (before running the full IOA pipeline itself), but we can statically determine the set of filenames that will be modified.

ConclusionMonads are opaque to static analysis, and applicative functors are poor at expressing dynamic-time data dependencies. It turns out arrows can provide a sweet spot between the two: by choosing the purely functional and the arrowized inputs carefully, it is possible to create an interface that allows for just the right interplay of dynamic behaviour and amenability to static analysis.

### regular-xmlpickler

### [ANNOUNCE] New Core Libraries Committee Members

### PPDP 2014 Call for Participation

### wren gayle romano: #done

Holy hell, things are bad for everyone.

I've started having PTSD issues again. One of my wife's coworkers got thrown in jail for 24hrs due to a domestic violence accusation (as required by Indiana state law for every accusation with any shred of evidence). Once he got out he filed for divorce because of it, to which his wife shot their son and herself and lit the house on fire— timed at 17 minutes before he was scheduled to (and did) arrive to pick up their son. An online friend of mine was dealing with a family crisis, got dumped by her fiancée, and has been on suicide watch. And now another friend is dealing with a suicide close to her

WTF world? W. T. F?

comments

### What is the state of the art in testing codegeneration?

### Why is "sum" lazy?

I understand why one might want to use foldl in a lazy manner, but if the function passed to foldl is (+), why would anyone object to using foldl'? I work in big data, and I always forget that sum is lazy and I end up getting stack size overflows, tracing all the way back to sum. If it's a teaching function (like some argue foldl is), then can we seriously consider an official, built-in prelude which is more sane? I see that idea mentioned here a lot, but can we make that dream a reality?

submitted by Pugolicious2244[link] [19 comments]

### Is there any library that deals with the creation of trees with shared branches?

Code explains better than words, so, take a look:

type TreeId = Int leaf :: Int -> TreeId isLeaf :: TreeId -> Boolean leafValue :: TreeId -> Int node :: TreeId -> TreeId -> TreeId isNode :: TreeId -> Boolean nodeLeft :: TreeId -> TreeId nodeRight :: TreeId -> TreeId toTree :: TreeId -> Tree a = node (leaf 1) (leaf 2) b = node (leaf 0) (node (leaf 1) (leaf 2)) main = do print (a == (nodeLeft b)) print (a == b)Output:

True FalseWhat is going on is the following: node and leaf are constructors for a Tree type. There is a detail, though: before creating the tree, those functions should check, in constant time, if an identical tree already exists. If it doesn't, it creates the Tree, tags it with a unique id and returns that id. If it does, instead of actually allocating more memory, it just returns the id of the existing tree. Notice that a and (left b) are the same value, because they are the same tree. This will guarantee that the heap stays compact, with not a single duplicated branch ever stored.

My questions are:

Is there a name for this?

Are there good-performing libraries available?

[link] [14 comments]