News aggregator

preview.getprismatic.com

del.icio.us/haskell - Mon, 01/06/2014 - 8:45pm
Categories: Offsite Blogs

Small, embeddable Haskell?

Haskell on Reddit - Mon, 01/06/2014 - 8:03pm

I'm interested in Haskell (downloaded it a couple of years ago), but I'd like it in a form similar to a Lua release: a bunch of C source code you can throw into your project and make your own interpreter around. Does such a Haskell release exist?

Edit: What I would like is a C API that looks roughly like this:

/ * Initialising and closing a particular Haskell session */

HaskellState * hsInit();

hsClose(HaskellState *);

/* Executing a command in a specific Haskell session */

void hsExecute(HaskellState *, char *);

/* Registering a symbol in Haskell which will call the C function specified by HaskellCallbackFn when Haskell calls it */

void hsRegister(HaskellState *, char *, HaskellCallbackFn);

Edit 2: I just want the functionality of the ghci interpreter in my C app, with the ability to add callbacks.

submitted by encabule
[link] [6 comments]
Categories: Incoming News

Associated types, kind constraints & typelits

haskell-cafe - Mon, 01/06/2014 - 7:06pm
All, While toying with typelits I wanted to get the following to work, but failed to. Is it intended not to work at all, should I change something to get it to work, or is this something which needs some GHC support? Note I obviously tried to add the constraint mentioned in the compiler error, but failed to: I seem to add too many type arguments to SingI, which somewhat puzzles me. Thanks, Nicolas {-# LANGUAGE TypeFamilies, DataKinds #-} module Main where import GHC.TypeLits class C a where type T a :: Nat data D = D instance C D where type T D = 10 {- This is not allowed, as intended: data E = E instance C E where type T E = Int -}
Categories: Offsite Discussion

Mateusz Kowalczyk: Hackage documentation v2

Planet Haskell - Mon, 01/06/2014 - 5:18pm
Posted on January 6, 2014 by Fūzetsu

This is just a quick follow up to my yesterday’s post (Actually looking at the date, it was this morning. My sleep ‘schedule’ is all over.).

Here are a few issues with yesterday’s post:

  1. My instructions stated cd dist/doc instead of cd dist/doc/html
  2. Documentation generated with my instructions would not have package cross-linking (very important). The links would have your local file-system paths instead.
  3. Documentation generated with my instructions would not have the /proper/ Contents page when clicked on a link from documentation. It would still have the Contents page, just not the one generated by Hackage.

Here are the fixes to each problem:

  1. Replace the bad cd path. I have reflected this in the small Bash script I published yesterday as a Gist and as a file. I did hear that some people were struggling with it a bit (such as on OS X) so if it doesn’t work out of the box, please make educated edits as you see fit.

  2. You can generate links to the packages with docs already on Hackage by passing an extra flag to cabal haddock. It is

    --html-location='http://hackage.haskell.org/package/$pkg/docs'

    Make sure you include $pkg verbatim, Cabal handles it automatically.

  3. As above, we just need to add an extra cabal haddock flag:

    --contents-location='http://hackage.haskell.org/package/$pkg'

If there are any further mistakes, please let me know and I’ll post up corrections.

If you have a better script for doing the uploads, also let me know and I’ll be happy to post it up if you want me to. Easy improvements would be to get the package version and package name from the cabal file.

Fan service.

Categories: Offsite Blogs

preview.getprismatic.com

del.icio.us/haskell - Mon, 01/06/2014 - 5:11pm
Categories: Offsite Blogs

Haskell Job Opportunity

Haskell on Reddit - Mon, 01/06/2014 - 4:14pm

AlephCloud is an early stage Silicon Valley startup creating a secure cloud content management system. Haskell is our main server side language and we are looking to hire a few more Haskell programmers. Some topics of interest to us are: cloud services, cryptography/security, cross compilation (javascript, iOS, ARM). We are somewhat flexible about location, though we'd prefer people on the west coast of the US.

If interested feel free to contact me (jeff@alephcloud.com) for more information, or send a message (and/or a resume) to resume@alephcloud.com.

thanks, Jeff

submitted by jeffpatalephcloud
[link] [16 comments]
Categories: Incoming News

Erik de Castro Lopo: When QuickCheck Fails Me

Planet Haskell - Mon, 01/06/2014 - 3:54pm

This is an old trick I picked up from a colleague over a decade ago and have re-invented or re-remembered a number of times since.

When implementing complicated performance critical algorithms and things don't work immediately, the best idea is to drop back to the old formula of:

  • Make it compile.
  • Make if correct.
  • Make if fast.

Often than means implementing slow naive versions of parts of the algorithm first and then one-by-one replacing the slow versions with fast versions. For a given function of two inputs, this might give me two functions with the identical type signatures:

functionSlow :: A -> B -> C functionFast :: A -> B -> C

that can be used interchangeably.

When it comes to implementing the fast versions, the slow versions can be used to check the correct-ness of the fast version using a simple QuickCheck property like:

\ a b -> functionFast a b == functionSlow a b

This property basically just asks QuickCheck to generate a, b pairs, pass these to both functions and comare their outputs.

With something like this, QuickCheck usually all the corner cases really quickly. Except for when it doesn't. QuickCheck uses a random number generator to generate inputs to the function under test. If for instance you have a function that takes a floating point number and only behaves incorrectly when that input is say exactly 10.3, the chances of QuickCheck generating exactly 10.3 and hitting the bug are very small.

For exactly this reason, using the technique above sometimes doesn't work. Sometimes the the fast version has a bug that Quickcheck wasn't able to find.

When this happens the trick is to write a third function:

functionChecked :: A -> B -> C functionChecked a b = let fast = functionFast a b slow = functionSlow a b in if fast == slow then fast else error $ "functionFast " ++ show a ++ " " ++ show b ++ "\nreturns " ++ show fast ++ "\n should be " ++ show slow

which calculates the function output using both the slow and the fast versions, compares the outputs and fails with an error if the two function outputs are not identical.

Using this in my algorithm I can then collect failing test cases that QuickCheck couldn't find. With a failing test case, its usually pretty easy to fix the broken fast version of the function.

Categories: Offsite Blogs

Neil Mitchell: Optimising Haskell for a tight inner loop

Planet Haskell - Mon, 01/06/2014 - 2:50pm

Summary: I walk through optimising a Haskell string splitter to get a nice tight inner loop. We look at the Haskell code, the generated Core, C-- and assembly. We get down to 6 assembly instructions per input character.

Let's start with some simple code:

break (`elem` " \r\n$") src

This code scans a string looking for a space, newline or $ and returns the string before and the string after. Our goal is to make this code faster - by the end we'll get down to 6 assembly instructions per input character. Before making things go faster we should write test cases (so we don't break anything), profile (so we are optimising the right thing) and write benchmarks (to check our changes make things go faster). To write this post, I did all those steps, but the post is only going to look at the generated Core, C-- and assembly - and be guided by guesses about what should go faster. The complete code is available online, along with the Core/C--/assembly for each step as produced by GHC 7.6.3.

Version 1

To turn our example into a complete program, we write:

module InnerLoop(innerLoop) where

innerLoop :: FilePath -> IO (String, String)
innerLoop file = do
src <- readFile file
return $ break test src

test x = x `elem` " \r\n$"

We can save this code as InnerLoop.hs and compile it with:

ghc -c -O2 InnerLoop.hs -ddump-simpl -ddump-cmm -ddump-asm > log.txt

The full output of log.txt is available here. It contains the GHC Core (which looks a bit like Haskell), then the C-- (which looks a bit like C) and finally the assembly code (which looks exactly like assembly). When optimising we usually look at the Core, then at the C--, then at the assembly - stopping whenever our profiling says we are done. Let's take a look at the inner loop in Core (with some light editing):

innerLoop_3 = GHC.CString.unpackCString# " \r\n\$"

test_1 = \ (x :: GHC.Types.Char) ->
GHC.List.elem @ GHC.Types.Char GHC.Classes.$fEqChar x innerLoop_3

innerLoop_2 =
...
case GHC.List.$wbreak @ GHC.Types.Char test_1 x of _
(# a, b #) -> (a, b)
...

The best way to read the Core is by looking for what you can understand, and ignoring the rest - it contains a lot of boring detail. We can see that a lot of things are fully qualified, e.g. GHC.List.elem. Some things have also been a bit mangled, e.g. $wbreak, which is roughly break. The interesting thing here is that break is being passed test_1. Looking at test_1 (which will be called on each character), we can see we are passing $fEqChar - a pair containing a function of how to perform equality on characters - to the elem function. For each character we are going to end up looping through a 4 element list (innerLoop_3) and each comparison will be going through a higher order function. Clearly we need to improve our test function.

Version 2

We can unroll the elem in test to give:

test x = x == ' ' || x == '\r' || x == '\n' || x == '$'

Compiling again and looking at the Core we see:

test_2 =
\ (x :: GHC.Types.Char) ->
case x of _ { GHC.Types.C# c ->
case c of _ {
__DEFAULT -> GHC.Types.False;
'\n' -> GHC.Types.True;
'\r' -> GHC.Types.True;
' ' -> GHC.Types.True;
'$' -> GHC.Types.True
}
}

Now for each character we extract the raw character (pattern matching against C#) then test it against the possibilities. GHC has optimised our repeated ==/|| into a nice case expression. It looks quite nice. Now the bottleneck is the break function.

Version 3

The break function is working on a String, which is stored as a linked list of characters. To get better performance we can move to ByteString, writing:

innerLoop :: FilePath -> IO (ByteString, ByteString)
innerLoop file = do
src <- BS.readFile file
return $ BS.break test src

For many people this is the reasonable-performance version they should stick with. However, let's look at the Core once more:

go = \ (a :: Addr#) (i :: Int#) (w :: State# RealWorld) ->
case i >=# len of _ {
GHC.Types.False ->
case readWord8OffAddr# @ GHC.Prim.RealWorld a 0 w
of _ { (# w, c #) ->
case chr# (word2Int# c) of _ {
__DEFAULT -> go (plusAddr# a 1) (i +# 1) w;
'\n' -> (# w, GHC.Types.I# i #);
'\r' -> (# w, GHC.Types.I# i #);
' ' -> (# w, GHC.Types.I# i #);
'$' -> (# w, GHC.Types.I# i #)
}
};
GHC.Types.True -> (# w, l_a1J9 #)
}

The first thing that should strike you is the large number of # symbols. In Core, a # means you are doing strict primitive operations on unboxed values, so if the optimiser has managed to get down to # that is good. You'll also notice values of type State# RealWorld which I've renamed w - these are an encoding of the IO monad, but have zero runtime cost, and can be ignored. Looking at the rest of the code, we have a loop with a pointer to the current character (a :: Addr#) and an index of how far through the buffer we are (i :: Int#). At each character we first test if the index exceeds the length, and if it doesn't, read a character and match it against the options. If it doesn't match we continue by adding 1 to the address and 1 to the index. Of course, having to loop over two values is a bit unfortunate.

Version 4

A ByteString needs an explicit length so it knows when it has come to the end of the buffer, so needs to keep comparing against explicit lengths (and for efficiency reasons, also maintaining those lengths). Looking to C for inspiration, typically strings are terminated by a \0 character, which allows looping without comparing against a length (assuming the source file does not contain \0). We can define our own null-terminated ByteString type with a break operation:

newtype ByteString0 = BS0 ByteString

readFile0 :: FilePath -> IO ByteString0
readFile0 x = do
src <- BS.readFile x
return $ BS0 $ src `BS.snoc` '\0'

We define a newtype wrapper around ByteString so we gain some type safety. We also define a readFile0 that reads a file as a ByteString0, by explicitly calling snoc with \0. We can now define our own break0 function (this is the only big chunk of Haskell in this article):

break0 :: (Char -> Bool) -> ByteString0 -> (ByteString, ByteString0)
break0 f (BS0 bs) = (BS.unsafeTake i bs, BS0 $ BS.unsafeDrop i bs)
where
i = Internal.inlinePerformIO $ BS.unsafeUseAsCString bs $ \ptr -> do
let start = castPtr ptr :: Ptr Word8
let end = go start
return $! end `minusPtr` start

go s | c == '\0' || f c = s
| otherwise = go $ inc s
where c = chr s

chr :: Ptr Word8 -> Char
chr x = Internal.w2c $ Internal.inlinePerformIO $ peek x

inc :: Ptr Word8 -> Ptr Word8
inc x = x `plusPtr` 1

We define break0 by finding the position at which the condition stops being true (i) and calling unsafeTake/unsafeDrop to slice out the relevant pieces. Because we know the second part is still null terminated we can rewrap in ByteString0. To find the index, we mostly use code copied from the bytestring library and modified. We convert the ByteString to a Ptr CChar using unsafeUseAsCString which just lets us look at the internals of the ByteString. We then loop over the pointer with go until we get to the first character that passes f and find how far we travelled. The function go looks at the current character using chr, and if it's \0 (the end) or the function f passes, returns the address at this point. Otherwise it increments the pointer. We use chr to peek at the pointer directly, and inlinePerformIO to do so purely and fast - since we know these buffers are never modified, the inlinePerformIO is morally defensible (we could have put chr in IO but that breaks a future optimisation we'll need to do).

Compiling to Core we see:

go = \ (x :: GHC.Prim.Addr#) ->
case readWord8OffAddr# @ RealWorld x 0 realWorld#
of _ { (# _, c #) ->
case GHC.Prim.chr# (GHC.Prim.word2Int# c) of _ {
__DEFAULT -> go (GHC.Prim.plusAddr# x 1);
'\NUL' -> GHC.Ptr.Ptr @ GHC.Word.Word8 x;
'\n' -> GHC.Ptr.Ptr @ GHC.Word.Word8 x;
'\r' -> GHC.Ptr.Ptr @ GHC.Word.Word8 x;
' ' -> GHC.Ptr.Ptr @ GHC.Word.Word8 x;
'$' -> GHC.Ptr.Ptr @ GHC.Word.Word8 x
}

Now we have a Core inner loop to be proud of. We loop round with a single pointer, peek at a byte, and compare it to our options. Time to look onwards to the C--, where I've included just the inner loop:

InnerLoop.$wgo_info()
c1Tt:
Hp = Hp + 8;
if (Hp > HpLim) goto c1Tx;
_s1RN::I32 = %MO_UU_Conv_W8_W32(I8[I32[Sp + 0]]);
_s1T5::I32 = _s1RN::I32;
_s1T6::I32 = _s1T5::I32;
if (_s1T6::I32 < 13) goto c1TG;
if (_s1T6::I32 < 32) goto c1TH;
if (_s1T6::I32 < 36) goto c1TI;
if (_s1T6::I32 != 36) goto c1TJ;
...
...
c1TJ:
_s1T4::I32 = I32[Sp + 0] + 1;
I32[Sp + 0] = _s1T4::I32;
Hp = Hp - 8;
jump InnerLoop.$wgo_info; // []
...

Reading the code, we first mess around with Hp, then pull a value out of the array and into _s1RN, then do some comparisons, and if they don't match jump to c1TJ, mess around with Hp again and jump back to start again.

There are three obvious problems with the code: 1) we mess around with Hp; 2) we are doing too many tests to get to the default case; 3) there is a jump in the middle of the loop.

Version 5

Let's start with the Hp variable. Hp is the heap pointer, which says how much heap GHC is using - if the heap gets above a certain limit, it triggers a garbage collection. The Hp = Hp + 8 reserves 8 bytes of heap for this function, Hp > HpLim checks if we need to garbage collect, and Hp = Hp - 8 at the bottom of the loop gives back that heap space. Why do we allocate 8 bytes, only to give it back at the end? The reason is that in the return path after the loop we do allocation. It's a long standing performance issue that GHC doesn't push the heap test down to the exit path, but we can fix it ourselves. Looking at the Core, we saw:

case GHC.Prim.chr# (GHC.Prim.word2Int# c) of _ {
__DEFAULT -> go (GHC.Prim.plusAddr# x 1);
'\NUL' -> GHC.Ptr.Ptr @ GHC.Word.Word8 x;

The expression GHC.Ptr.Ptr @ GHC.Word8 x is allocating a constructor around the pointer to return. Looking at the Ptr type we discover:

data Ptr a = Ptr Addr#

So Ptr is simply a constructor wrapping our address. To avoid the Ptr in the inner loop, we can switch to returning Addr# from go:

i = Internal.inlinePerformIO $ BS.unsafeUseAsCString bs $ \ptr -> do
let start = castPtr ptr :: Ptr Word8
let end = go start
return $! Ptr end `minusPtr` start

go s@(Ptr a) | c == '\0' || f c = a
| otherwise = go $ inc s
where c = chr s

We also add back the Ptr around end do call minusPtr. Looking at the Core we now see a very simple return path:

case GHC.Prim.chr# (GHC.Prim.word2Int# ipv1_a1D0) of _ {
__DEFAULT -> InnerLoop.$wgo (GHC.Prim.plusAddr# ww_s1PR 1);
'\NUL' -> ww_s1PR;

And dropping down to C-- we see:

c1Td:
_s1Ry::I32 = %MO_UU_Conv_W8_W32(I8[I32[Sp + 0]]);
_s1SP::I32 = _s1Ry::I32;
_s1SQ::I32 = _s1SP::I32;
if (_s1SQ::I32 < 13) goto c1Tn;
if (_s1SQ::I32 < 32) goto c1To;
if (_s1SQ::I32 < 36) goto c1Tp;
if (_s1SQ::I32 != 36) goto c1Tq;
R1 = I32[Sp + 0];
Sp = Sp + 4;
jump (I32[Sp + 0]); // [R1]
c1Tq:
_s1SO::I32 = I32[Sp + 0] + 1;
I32[Sp + 0] = _s1SO::I32;
jump InnerLoop.$wgo_info; // []

Not a single mention of Hp. We still have a lot more tests than we'd like though.

Version 6

The current code to check for our 5 terminating characters compares each character one by one. This entire example is based on lexing Ninja source files, so we know that most characters will be alphanumeric. Using this information, we can instead test if the character is less than or equal to $, if it is we can test for the different possibilities, otherwise continue on the fast path. We can write:

test x = x <= '$' && (x == ' ' || x == '\r' || x == '\n' || x == '$')

Now looking at the Core we see:

go = \ (ww_s1Qt :: GHC.Prim.Addr#) ->
case GHC.Prim.readWord8OffAddr#
@ GHC.Prim.RealWorld ww_s1Qt 0 GHC.Prim.realWorld#
of _ { (# _, ipv1_a1Dr #) ->
case GHC.Prim.chr# (GHC.Prim.word2Int# ipv1_a1Dr) of wild_XH {
__DEFAULT ->
case GHC.Prim.leChar# wild_XH '$' of _ {
GHC.Types.False -> go (GHC.Prim.plusAddr# ww_s1Qt 1);
GHC.Types.True ->
case wild_XH of _ {
__DEFAULT -> go (GHC.Prim.plusAddr# ww_s1Qt 1);
'\n' -> ww_s1Qt;
'\r' -> ww_s1Qt;
' ' -> ww_s1Qt;
'$' -> ww_s1Qt
}
};
'\NUL' -> ww_s1Qt
}
}

The code looks reasonable, but the final \NUL indicates that the code first checks if the character is \NUL (or \0) and only then does our fast < $ test.

Version 7

To perform our < $ test before checking for \0 we need to modify go. We require that the argument predicate must return False on \0 (otherwise we'll run off the end of the string) and can then write:

go s@(Ptr a) | f c = a
| otherwise = go $ inc s
where c = chr s

test x = x <= '$' &&
(x == ' ' || x == '\r' || x == '\n' || x == '$' || x == '\0')

The Core reads:

InnerLoop.$wgo =
\ (ww_s1Qq :: GHC.Prim.Addr#) ->
case GHC.Prim.readWord8OffAddr#
@ GHC.Prim.RealWorld ww_s1Qq 0 GHC.Prim.realWorld#
of _ { (# _, ipv1_a1Dr #) ->
let {
c1_a1uU [Dmd=Just L] :: GHC.Prim.Char#
[LclId, Str=DmdType]
c1_a1uU = GHC.Prim.chr# (GHC.Prim.word2Int# ipv1_a1Dr) } in
case GHC.Prim.leChar# c1_a1uU '$' of _ {
GHC.Types.False -> InnerLoop.$wgo (GHC.Prim.plusAddr# ww_s1Qq 1);
GHC.Types.True ->
case c1_a1uU of _ {
__DEFAULT -> InnerLoop.$wgo (GHC.Prim.plusAddr# ww_s1Qq 1);
'\NUL' -> ww_s1Qq;
'\n' -> ww_s1Qq;
'\r' -> ww_s1Qq;
' ' -> ww_s1Qq;
'$' -> ww_s1Qq
}
}
}

The C-- reads:

InnerLoop.$wgo_info()
c1Uf:
_s1Se::I32 = %MO_UU_Conv_W8_W32(I8[I32[Sp + 0]]);
_s1Sh::I32 = _s1Se::I32;
_s1Sg::I32 = _s1Sh::I32;
_c1TZ::I32 = _s1Sg::I32 <= 36;
;
if (_c1TZ::I32 >= 1) goto c1Ui;
_s1Ty::I32 = I32[Sp + 0] + 1;
I32[Sp + 0] = _s1Ty::I32;
jump InnerLoop.$wgo_info; // []

And the assembly reads:

InnerLoop.$wgo_info:
_c1Uf:
movl 0(%ebp),%eax
movzbl (%eax),%eax
cmpl $36,%eax
jbe _c1Ui
incl 0(%ebp)
jmp InnerLoop.$wgo_info

We have ended up with a fairly small 6 instruction loop.

Version 8

We've now exhausted my Haskell bag of tricks, and have to stop. But the assembly code could still be improved. In each loop we read the contents of the memory at %ebp into %eax, and increment the contents of the memory at %ebp at the end - we're manipulating the value on the top of the stack (which is pointed to by %ebp). We could instead cache that value in %ebx, and write:

_c1Uf:
movzbl (%ebx),%eax
cmpl $36,%eax
jbe _c1Ui
incl %ebx
jmp _c1Uf

One less instruction, two less memory accesses. I tried the LLVM backend (using LLVM 2.8 at -O3), but it generated significantly worse assembly code. I don't know how to optimise any further without dropping down to the C FFI, but I'm sure one day GHC/LLVM will automatically produce the shorter assembly code.

Update: It appears on 64bit x86 GHC already produces the minimal assembly.

Categories: Offsite Blogs

preprocessor, how to?

haskell-cafe - Mon, 01/06/2014 - 10:13am
In a program I use a lot of hamlet files. All these hamlet files are in a submap of the source map. In code: allHtml = Mconcat [ $(hamletFile "hamlet\\belastingen.hamlet") renderUrl ,$(hamletFile "hamlet\\winddruk.hamlet") renderUrl .. ,$(hamletFile "hamlet\\berekeningDruk.hamlet") renderUrl ,$(hamletFile "hamlet\\drukNaarBuigTrek.hamlet") renderUrl] This is a Windows program. For another project I want to use this piece of code on Linux. The code should now be: .. $(hamletFile "hamlet/winddruk.hamlet") renderUrl I can't use pathSeparator here, because this is all happening at compile time. So I think I need a preprocessor. I would like to have a kind of hamletDir macro and change the code to something like: #ifdef Win32 hamletDir = "hamlet\\" #else hamletDir = "hamlet/" #endif ... $(hamletFile "{hamletDir}winddruk.hamlet") renderUrl Is it possible in Haskell and how do I it? All examples of prepr
Categories: Offsite Discussion

Win32-2.3.0.0 not on Hackage

libraries list - Mon, 01/06/2014 - 8:11am
Hi all, I've noticed that Win32-2.3.0.0 (that ships with GHC 7.6.3) is not on Hackage, even though the previous versions are. Can someone who has the appropriate permissions please upload it? The 2.3.0.0 version corresponds to the current state of the ghc-7.6 branch in the source repository [1]. [1] http://git.haskell.org/packages/Win32.git
Categories: Offsite Discussion

Home - Metasepi

del.icio.us/haskell - Mon, 01/06/2014 - 7:50am
Categories: Offsite Blogs

hackage build failure (?) - but no log

haskell-cafe - Mon, 01/06/2014 - 5:54am
Mateusz Kowalczyk's recent message about fixing hackage documentation made me realize that one of my packages (chatter) has no build log, and indeed, the documentation was not showing up on hackage. Is there any way to figure out what is causing the failure? Chatter has no non-Haskell dependencies, and it builds just fine in a sandbox on the major three architectures (including building documentation on at least two of them - I haven't tested haddock on Windows). Manually posting the documentation is more than a little inconvenient at the moment, and given the nature of the library, I'm pretty surprised this is an issue. (I did post the docs for chatter-0.0.0.3 manually, if you're wondering why they appear despite my complaints...) Does anyone have suggestions? Thanks! Rogan _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe< at >haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Categories: Offsite Discussion

Erik de Castro Lopo: When QuickCheck Fails Me

Planet Haskell - Mon, 01/06/2014 - 4:45am

This is an old trick I picked up from a colleague over a decade ago and have re-invented or re-remembered a number of times since.

When implementing complicated performance critical algorithms and things don't work immediately, the best idea is to drop back to the old formula of:

  • Make it compile.
  • Make if correct.
  • Make if fast.

Often than means implementing slow naive versions of parts of the algorithm first and then one-by-one replacing the slow versions with fast versions. For a given function of two inputs, this might give me two functions with the identical type signatures:

functionSlow :: A -> B -> C functionFast :: A -> B -> C

that can be used interchangeably.

When it comes to implementing the fast versions, the slow versions can be used to check the correct-ness of the fast version using a simple QuickCheck property like:

\ a b -> functionFast a b == functionSlow a b

This property basically just asks QuickCheck to generate a, b pairs, pass these to both functions and comare their outputs.

With something like this, QuickCheck usually all the corner cases really quickly. Except for when it doesn't. QuickCheck uses a random number generator to generate inputs to the function under test. If for instance you have a function that takes a floating point number and only behaves incorrectly when that input is say exactly 10.3, the chances of QuickCheck generating exactly 10.3 and hitting the bug are very small.

For exactly this reason, using the technique above sometimes doesn't work. Sometimes the the fast version has a bug that Quickcheck wasn't able to find.

When this happens the trick is to write a third function:

functionChecked :: A -> B -> C functionChecked a b = let fast = functionFast a b slow = functionFast a b in if fast == slow then fast else error $ "functionFast " ++ show a ++ " " ++ show b ++ "\nreturns " ++ show fast ++ "\n should be " ++ show slow

which calculates the function output using both the slow and the fast versions, compares the outputs and fails with an error if the two function outputs are not identical.

Using this in my algorithm I can then collect failing test cases that QuickCheck couldn't find. With a failing test case, its usually pretty easy to fix the broken fast version of the function.

Categories: Offsite Blogs

Please fix documentation for you Hackage packages!

haskell-cafe - Mon, 01/06/2014 - 3:37am
Greetings café, As some of you might have noticed recently, there seems to be quite a few packages with broken documentation on Hackage recently. If you are an owner of such package, please consider fixing it. There's a thread on cabal-devel about this if you're interested in details. Here's a list of packages uploaded since beginning of 2013 for which the documentation was broken as of yesterday: http://fuuzetsu.co.uk/misc/sorted.txt If your package is on that list, your documentation is broken. Only the most recent versions of packages were being considered. I outline how to fix your documentation (in most cases this means uploading it by hand) in a blog post I just published. Please refer to: http://fuuzetsu.co.uk/blog/posts/2014-01-06-Fix-your-Hackage-documentation.html The post contains a link to as script which naively attempts to automate the burden of uploading the docs manually. If your package can't be built simply with ‘cabal configure && cabal build && cabal haddock --hyperlink-source’,
Categories: Offsite Discussion

Parallel building multiple targets

glasgow-user - Sun, 01/05/2014 - 11:47pm
Hi, I have a Haskell project where a number of executables are produced from mostly the same modules. I'm using a Makefile to enable parallel builds. I received advice[1] that ghc -M is broken, but that there is parallel ghc --make in HEAD. As far as I can tell, ghc --make does not allow building several binaries in one run, so I think it may not still be a full replacement for Makefiles. However I have a question about ghc --make that is also relevant without parallel ghc --make: If I have two main modules, prog1.hs and prog2.hs, which have mutual dependencies (for example, both import A from A.hs), is it safe to run "ghc --make prog1" in parallel with "ghc --make prog2"? IOW, is there some kind of locking to prevent both from building module A at the same time and interfering with each other? Is there a good way (either in current releases or HEAD) to build multiple binaries partially from the same sources in parallel? Sami [1] http://stackoverflow.com/questions/20938894/generating-correct-link-de
Categories: Offsite Discussion