Improving Haskell MD5

Submitted by TomMD on Fri, 11/09/2007 - 7:42pm.

I cleaned up the code!

  • Fixed the ugly 'blocks' function. This results in better performance and fixed the space leak!
  • Changed the license from GPL2 to BSD3.
  • Eliminating the 'finalize' doesn't give any further performance, so I think this was a really odd bug in my code, not GHC.

    Again, the code is at:
    darcs get http://code.haskell.org/~tommd/pureMD5

    And here is the quick and dirty performance of the unrolled version:

    $ time ./hsMD5 ../2GB
    b8f1f1faa6dda5426abffb3a7811c1fb

    real 1m22.685s
    user 0m31.730s
    sys 0m4.676s

    $ time md5sum ../2GB
    b8f1f1faa6dda5426abffb3a7811c1fb ../2GB

    real 1m21.343s
    user 0m25.330s
    sys 0m4.528s

    So, at least on my clunker, the Haskell code shows good time on the 2GB test.


    $ time ./hsMD5 ../200MB
    eda9a9889837ac4bc81d6387d92c1bec

    real 0m2.267s
    user 0m2.148s
    sys 0m0.116s

    $ time md5sum ../200MB
    eda9a9889837ac4bc81d6387d92c1bec ../200MB

    real 0m0.775s
    user 0m0.704s
    sys 0m0.068s

    And its just under 3 time slower than C on that one.