Haskell - guardian of functional purity but violator of software engineering purity?

Submitted by metaperl on Sat, 05/13/2006 - 2:07pm.

Yesterday, after 1 year of studying Haskell, I tried to write my first program involving I/O. I decided to convert a Perl program that I had written which provides lookup of state names via their 2-letter code and vice-versa. A day later, I had something written, but was worried because all of my functions were using fmap to pull things out of the IO monad. In other words, my entire library reeked of I/O.

The next day, Cale Gibbard wrote a very nice version of my program which relegated the IO to a single function.

But the thing is, you still have to use my module from the main program with the awareness that the data is coming from an IO source. Now, a basic principle of software engineering is to separate the interface from the implementation. There is no way to use the Haskell implementation of my module without full knowledge that said module is returning data that comes from I/O... outside of using something called unsafePerformIO, which I know nothing about.

Why does this matter?
Let's say I have a function which does something to a list of functions of type (String -> String) but one of the functions has to open a file to get its string:

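-- Apply each function to the same input and measure the length of its result.
sfnProc :: String -> [String -> String] -> [Int]
sfnProc input = map (\f -> length (f input))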

So the problem is that the one function that returns a string via I/O cannot be used opaquely: Haskell's referential transparency is killing the opaque API that software engineering calls for.
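A minimal sketch of the mismatch described above. All names here (shout, fromFile, liftPure, sfnProcIO) are hypothetical and not taken from the module under discussion. It assumes every element of the list is given the type String -> IO String, with the pure functions lifted by return:

```haskell
-- Hypothetical names throughout; this only illustrates the type mismatch.

-- A pure function fits the original list type directly.
shout :: String -> String
shout s = s ++ "!"

-- A function that reads a file cannot have type String -> String;
-- its result is wrapped in IO.
fromFile :: FilePath -> IO String
fromFile path = readFile path

-- Lift a pure function so it can sit in the same list as fromFile.
liftPure :: (String -> String) -> String -> IO String
liftPure f = return . f

-- Process both kinds together: the IO shows up in the element type
-- and in the result type, which is exactly the loss of opacity.
sfnProcIO :: String -> [String -> IO String] -> IO [Int]
sfnProcIO input fs = mapM (\f -> fmap length (f input)) fs
```

A caller would write something like sfnProcIO "ab" [liftPure shout, fromFile], and the caller can see from the types which style of function is involved.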

Submitted by Derek Elkins on Sun, 05/14/2006 - 12:08pm.

This is a well-known issue (one that keeps some people from using Haskell), but the point is that something that needs to read a file is not referentially transparent (or, if you "know" that it "is", then using unsafePerformIO is acceptable). However, the style of programming that Haskellers use to address this is to separate the "processing" parts from the IO parts. For example, in this case you would get that string from a file at the top level and simply pass it to a pure function that needs it. Of course, if this is deep in a calling tree you can get a similar problem, which is probably best handled by the Reader monad or implicit parameters.
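A rough sketch of that separation, applied to the state-code lookup from the original post. The file format (one "CA California" pair per line) and the names parseStates, stateName and loadStates are assumptions made for illustration only:

```haskell
import Data.Char (toUpper)

-- Pure "processing" part: no IO in its type.
-- Parses lines of the assumed form "CA California" into (code, name)
-- pairs; blank lines are skipped by the failing pattern match.
parseStates :: String -> [(String, String)]
parseStates contents =
  [ (code, unwords rest) | (code : rest) <- map words (lines contents) ]

-- Pure lookup by two-letter code; the code is upper-cased first.
stateName :: [(String, String)] -> String -> Maybe String
stateName table code = lookup (map toUpper code) table

-- The only IO: read the file once at the top level, then hand the
-- resulting table to pure code.
loadStates :: FilePath -> IO [(String, String)]
loadStates path = fmap parseStates (readFile path)
```

The main program would call loadStates once and pass the table to stateName wherever it is needed, so only loadStates mentions IO in its type.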

Submitted by Paul Johnson on Sun, 05/21/2006 - 11:59am.

The way you tackle this depends on what you are trying to do.

1: You could, as Derek suggests, read the string in at the top level and pass it down to the function.

2: You could use unsafePerformIO. If this string is coming from a file that is guaranteed constant during execution (like a configuration file) then this is OK, although not very elegant.

3: You could make the functions have the type (IO String -> IO String). Then instead of passing them a string you pass the action that returns the string. In some cases this is "return str", but in others it will be an action that reads the appropriate file. (The result has to stay in IO, because a pure function cannot run the action it is given.)
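A small sketch of the third option, assuming the result stays in IO (a pure function cannot run the action it receives). The names shoutFrom, fromConstant and fromFilePath are hypothetical:

```haskell
-- The function receives an action rather than a string, and maps a
-- pure transformation over whatever that action produces.
shoutFrom :: IO String -> IO String
shoutFrom getStr = fmap (++ "!") getStr

-- A string already in hand is wrapped with 'return'.
fromConstant :: IO String
fromConstant = shoutFrom (return "hello")

-- A string that lives in a file is supplied as a read action.
fromFilePath :: FilePath -> IO String
fromFilePath path = shoutFrom (readFile path)
```

The caller chooses where the string comes from, while shoutFrom itself never mentions files.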


Submitted by yitzg on Tue, 06/20/2006 - 6:21am.

I disagree with the other replies - this is not a problem in Haskell.
In fact, when used properly, Haskell is much better at this than, say, Perl.

First of all, the kinds of cases you mention often use polymorphism. For example, you write functions with types like:

f :: Monad m => X -> m Y

So it could be using IO, or not. You could be throwing and catching exceptions (in an Error monad), or not. Etc.
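As a hedged illustration of that polymorphic style, here is a sketch in which the lookup does not know whether its table comes from IO. The names lookupName, pureTable and ioTable are invented for this example, and ioTable merely stands in for an action that would read a file:

```haskell
import Data.Functor.Identity (Identity (..))

-- Works in any monad: the caller supplies the action that yields the
-- table, and the lookup itself never mentions IO.
lookupName :: Monad m => m [(String, String)] -> String -> m (Maybe String)
lookupName getTable code = do
  table <- getTable
  return (lookup code table)

-- A pure source: the table is simply a value wrapped in Identity.
pureTable :: Identity [(String, String)]
pureTable = Identity [("CA", "California"), ("NY", "New York")]

-- An IO source: a stand-in for an action that would read a file.
ioTable :: IO [(String, String)]
ioTable = return [("CA", "California"), ("NY", "New York")]
```

The same function can then be used as runIdentity (lookupName pureTable "CA") in pure code, or as lookupName ioTable "CA" inside an IO block.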

But there is a deeper and more serious way in which it is Perl that is violating the separation.

Every piece of code in Perl has built into it the assumption that you are running code on a single CPU, executing one statement after the next, storing values in physical memory and retrieving them, etc.
That is a serious implementation dependency.

Your application is usually better viewed at a higher level, where some of those assumptions may not be true. For example, you may run it as a stateless web app, where there is no memory at all shared between parts of your program. Or in a GUI or other concurrent framework, where the order in which things will happen is not defined in advance.

Haskell makes it easy to separate out those kinds of implementation dependencies.

Haskell programs are naturally separated between three types of calculations:

  1. Those that inherently must depend on the physical environment in which they are running.
  2. Those that must share some kind of state with other parts of the program.
  3. Calculations that require neither of the above.

You do as much as you can at levels 3 and 2 - but staying polymorphic, so you can play nicely with any stuff from lower levels.

If you need to keep state, you don't say "store that in a variable in memory", which is level 1. You say "I need to share some state", which is level 2. Your level 1 code then says where the state comes from and where it goes.

Haskell APIs tend to be at level 3 or 2, unless they are specifically dealing with hardware. So their separation from implementation is much better than APIs in Perl or other "procedural" languages.
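One possible reading of the three levels, as a sketch. The Counter record and the names bump, tick and ioCounter are assumptions made for illustration; a State monad would be another way to write level 2:

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)

-- Level 3: a pure calculation, no state and no environment.
bump :: Int -> Int
bump n = n + 1

-- Level 2: "I need to share some state", without saying where it
-- lives. The caller supplies how to read and write it.
data Counter m = Counter
  { readCount  :: m Int
  , writeCount :: Int -> m ()
  }

-- Advance the shared counter and return its new value.
tick :: Monad m => Counter m -> m Int
tick counter = do
  n <- readCount counter
  let n' = bump n
  writeCount counter n'
  return n'

-- Level 1: decide that the state lives in memory, in an IORef.
ioCounter :: IO (Counter IO)
ioCounter = do
  ref <- newIORef 0
  return (Counter (readIORef ref) (writeIORef ref))
```

Only ioCounter commits to physical memory. tick would work unchanged if the counter were backed by a file or a database, as long as someone supplies a matching Counter.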
