# The most important thing you can know about Haskell and Functional Programming

Submitted by metaperl on Thu, 10/06/2005 - 6:16pm.
I bought "The Haskell Road to Logic, Maths and Programming" but never looked at it until recently. Even though I had gone through 16 chapters of Simon Thompson's book, I had failed to grasp just what Haskell was about, but I knew there was something I was missing. And then I saw it in Section 1.9 of "The Haskell Road...", entitled "Haskell Equations and Equational Reasoning":

> The Haskell equations f x y = ... used in the definition of a function f are genuine mathematical equations. They state that the left hand side and the right hand side of the equation have the same value. This is very different from the use of = in imperative languages like C or Java. In a C or Java program the statement x = x*y does not mean that x and x*y have the same value, but rather it is a command to throw away the old value of x and put the value of x*y in its place. It is a so-called destructive assignment statement: the old value of a variable is destroyed and replaced by a new one.

> Reasoning about Haskell definitions is a lot easier than reasoning about programs that use destructive assignment. In Haskell, standard reasoning about mathematical equations applies. E.g. after the Haskell declarations x = 1 and y = 2, the Haskell declaration x = x + y will raise an error "x" multiply defined. ... = in Haskell has the meaning "is by definition equal to"...
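To make the contrast concrete, here is a small sketch of the situation the book describes (the name `z` is my own addition, not from the book):

```haskell
x :: Int
x = 1

y :: Int
y = 2

-- x = x + y   -- rejected by the compiler: "Multiple declarations of 'x'"

-- Instead of updating x, we give the new value a new name:
z :: Int
z = x + y      -- z is by definition equal to x + y

main :: IO ()
main = print z -- prints 3
```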

This was a huge landslide victory for me, because I quit trying to write programs to get data here, data there, values here, values there. Instead, I simply began to rewrite the original function as a new definition.

I became so confident that I was able to write a program to return all the leaves of a tree, and here it is:

```haskell
data Tree a = Empty | Node a (Tree a) (Tree a)

-- leaves takes a tree and an accumulated list and returns a list of leaves
-- of the tree
leaves :: Tree a -> [a] -> [a]
leaves tree lis
  | empty tree               = lis  -- an empty tree is just the leaves so far
  -- add on current node if it is terminal.. NO! scratch that! no add
  -- on! That is an action. We are simply rewriting leaves tree lis
  -- as something else based on what we found out about leaves tree lis
  | terminal tree            = currentNode tree : lis
  | empty (rightBranch tree) = leaves (leftBranch tree) lis
  | empty (leftBranch tree)  = leaves (rightBranch tree) lis
  | otherwise                = leaves (leftBranch tree) lis
                               ++ leaves (rightBranch tree) lis

-- helper functions assumed above
empty Empty = True
empty _     = False

terminal (Node _ Empty Empty) = True
terminal _                    = False

currentNode (Node x _ _) = x
leftBranch  (Node _ l _) = l
rightBranch (Node _ _ r) = r
```
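For comparison, the same rewriting can be done by pattern matching on the constructors directly, which removes the need for helper predicates; a sketch (the name `leavesAcc` is mine):

```haskell
data Tree a = Empty | Node a (Tree a) (Tree a)

-- Collect leaves with an accumulator, matching constructors directly.
leavesAcc :: Tree a -> [a] -> [a]
leavesAcc Empty                lis = lis          -- nothing to add
leavesAcc (Node x Empty Empty) lis = x : lis      -- a leaf node
leavesAcc (Node _ l r)         lis = leavesAcc l (leavesAcc r lis)

main :: IO ()
main = print (leavesAcc (Node 1 (Node 2 Empty Empty) (Node 3 Empty Empty)) [])
-- prints [2,3]
```

Each clause is still just an equation: the left-hand side is rewritten as the right-hand side, never "updated".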

Looking back at "Algorithms in Haskell" by Rabhi and "Craft of FP" by Simon Thompson, they do both make this same statement, but somehow it never really hit me right.

One of my favorite spiritual quotes is actually said by a number of people. I think it covers what I was frantically trying to do in imperative languages and what Haskell never does:

Nothing is happening. Nothing ever has happened. Nothing ever will happen. All that has occurred is me being aware of the dance of light on my consciousness (or something like that).
-- EJ Gold, the American Book of the Dead

"In Reality, nothing happens! It is a great gift to be able to understand this; if you perceive this, you are blessed, for inner vision has been granted to you."

~Ma Anandamayi

'Whatever you experience, realize that all of it is simply the unobstructed play of your own mind.' -Tsele Natsog Rangdrol

In the quantum mechanical model nothing ever happens!

-- http://www.mtnmath.com/cat.html

It seems more natural to me to define leaves in such a way that it is of type Tree a -> [a].

Then you can simply collect the leaves in a bottom-up fashion:

```haskell
leaves Empty        = []
leaves (Node x l r) = x : leaves l ++ leaves r
```

(Note that the use of (++) is actually quite expensive. One usually circumvents this by introducing a so-called accumulating parameter, very much like you did:

```haskell
leaves :: Tree a -> [a]
leaves = flip leaves' []
  where
    leaves' Empty        = id
    leaves' (Node x l r) = (x :) . leaves' l . leaves' r
```

)

Why should ++ be expensive? I thought that a list was implemented internally as a lazy linked list, and I would have thought that ++ should be cheap, just like : is.

xs ++ ys takes time proportional to the size of xs, rendering the running time of a naive version of leaves quadratic.
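The effect is easiest to see in how the appends nest: a left-nested chain like ((xs1 ++ xs2) ++ xs3) re-walks everything built so far at every step, while a right-nested chain touches each element only once. A sketch (the function names are mine):

```haskell
-- Left-nested appends: ((xs1 ++ xs2) ++ xs3) ++ ...
-- Each (++) retraverses the entire result so far: O(n^2) overall.
slowConcat :: [[a]] -> [a]
slowConcat = foldl (++) []

-- Right-nested appends: xs1 ++ (xs2 ++ (xs3 ++ ...))
-- Each element is passed over once: O(n) overall.
fastConcat :: [[a]] -> [a]
fastConcat = foldr (++) []

main :: IO ()
main = print (slowConcat [[1], [2], [3]] == fastConcat [[1], [2], [3]])
-- prints True: same result, very different cost on long inputs
```

The naive `leaves` builds exactly the left-nested shape on degenerate (list-like) trees, which is where the quadratic behaviour comes from.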

(++) is defined as (in module GHC.Base):

```haskell
(++) :: [a] -> [a] -> [a]
(++) []     ys = ys
(++) (x:xs) ys = x : xs ++ ys
```

So it adds a constant overhead to the lazy evaluation of (:) in the first part of the combined list (i.e. the pattern match to decide between the two cases). Unless I'm misunderstanding something, this is not proportional to the sizes of the lists in the concatenation.

```haskell
(x : xs) ++ ys = x : (xs ++ ys)
```

Note that the two appearances of the cons constructor actually refer to different nodes here: you're rebuilding the first list. So, you evaluate and reconstruct every cons node in the first argument list. Clearly, this takes time proportional to the length of the first argument list. Admittedly, implementations, e.g. GHC, do offer some ad-hoc optimizations.

Perhaps I'm not explaining this too clearly; I'm sorry for that. However, although it has been a while for me, as far as I can remember, this stuff is covered in almost all beginner's tutorials on the language.

Consult, for instance, Section 8.3 of A Gentle Introduction to Haskell. Furthermore, the textbook on algorithms in functional languages by Rabhi and LaPalme, as well as Okasaki's standard work on Purely Functional Data Structures, do a good job of explaining how to analyze the efficiency of functional programs.

Well, that's if you are using lists. Functional deques or something else could be better...

Another very simple example is the standard way to transform a tree into a list:

```haskell
inorder Empty        = []
inorder (Node x l r) = inorder l ++ [x] ++ inorder r
```

Again we have quadratic running time, since ++ traverses and copies all of its left argument. Perhaps the reader is trained to transform that code to make it faster: one can use an accumulator to replace ++. But the simpler solution is to use a better Sequence implementation. The beauty of the program is preserved, and the program becomes asymptotically optimal. It's a pity that students are usually thaught the other way. How then should they learn Reuse and Abstraction?