News aggregator

Haskell and RabbitMQ

del.icio.us/haskell - Wed, 01/07/2015 - 5:39am
Categories: Offsite Blogs

FP Complete: Announcing: mutable-containers 0.2

Planet Haskell - Wed, 01/07/2015 - 5:00am

As part of our high-performance computing work, I recently found myself in need of some fast mutable containers. The code is now available on both Hackage and Stackage. The code is pretty young, and is open to design changes still. That said, the currently released version (0.2.0) is well tested and performs fairly well. If there are ideas for improvement, please let me know!

Below is the content of the README file, which gives a good overview of the package, as well as benchmark numbers and test coverage statistics (spoiler: 100%). As always, you can see the README on Github or on Stackage.

One of Haskell's strengths is immutable data structures. These structures make it easier to reason about code, simplify concurrency and parallelism, and in some cases can improve performance by allowing sharing. However, there are still classes of problems where mutable data structures can both be more convenient, and provide a performance boost. This library is meant to provide such structures in a performant, well tested way. It also provides a simple abstraction over such data structures via typeclasses.

Before anything else, let me provide the caveats of this package:

  • Don't use this package unless you have a good reason to! Immutable data structures are a better approach most of the time!
  • This code is intentionally not multithread safe. If you need something like a concurrent queue, there are many options on Hackage, from Chan to TChan, to chaselev-deque.

We'll first talk about the general approach to APIs in this package. Next, there are two main sets of abstractions provided, which we'll cover in the following two sections, along with their concrete implementations. Finally, we'll cover benchmarks.

API structure

The API takes heavy advantage of the PrimMonad typeclass from the primitive package. This allows our data structures to work in both IO and ST code. Each data structure has an associated type, MCState, which gives the primitive state for that structure. For example, in the case of IORef, that state is RealWorld, whereas for STRef s, it would be s. This associated type is quite similar to the PrimState associated type from primitive, and in many type signatures you'll see an equality constraint along the lines of:

PrimState m ~ MCState c

For those who are wondering, MCState stands for "mutable container state."

All actions are part of a typeclass, which allows for generic access to different types of structures quite easily. In addition, we provide type hint functions, such as asIORef, which can help specify types when using such generic functions. For example, a common idiom might be:

ioref <- fmap asIORef $ newRef someVal

Wherever possible, we stick to well accepted naming and type signature standards. For example, note how closely modifyRef and modifyRef' match modifyIORef and modifyIORef'.

Single cell references

The base package provides both IORef and STRef as boxed mutable references, for storing a single value. The primitive package also provides MutVar, which generalizes over both of those and works for any PrimMonad instance. The MutableRef typeclass in this package abstracts over all three of those. It has two associated types: MCState for the primitive state, and RefElement to specify what is contained by the reference.

You may be wondering: why not just take the reference as a type parameter? That wouldn't allow us to have monomorphic reference types, which may be useful under some circumstances. This is a similar motivation to how the mono-traversable package works.

In addition to providing an abstraction over IORef, STRef, and MutVar, this package provides four additional single-cell mutable references. URef, SRef, and BRef all contain a 1-length mutable vector under the surface, which is unboxed, storable, and boxed, respectively. The advantage of the first two over boxed standard boxed references is that it can avoid a significant amount of allocation overhead. See the relevant Stack Overflow discussion and the benchmarks below.

While BRef doesn't give this same advantage (since the values are still boxed), it was trivial to include it along with the other two, and does actually demonstrate a performance advantage. Unlike URef and SRef, there is no restriction on the type of value it can store.

The final reference type is PRef. Unlike the other three mentioned, it doesn't use vectors at all, but instead drops down directly to a mutable bytearray to store values. This means it has slightly less overhead (no need to store the size of the vector), but also restricts the types of things that can be stored (only instances of Prim).

You should benchmark your program to determine the most efficient reference type, but generally speaking PRef will be most performant, followed by URef and SRef, and finally BRef.

Collections

Collections allow you to push and pop values to the beginning and end of themselves. Since different data structures allow different operations, each operation goes into its own typeclass, appropriately named MutablePushFront, MutablePushBack, MutablePopFront, and MutablePopBack. There is also a parent typeclass MutableCollection which provides:

  1. The CollElement associated type to indicate what kinds of values are in the collection.
  2. The newColl function to create a new, empty collection.

The mono-traversable package provides a typeclass IsSequence which abstracts over sequence-like things. In particular, it provides operations for cons, snoc, uncons, and unsnoc. Using this abstraction, we can provide an instance for all of the typeclasses listed above for any mutable reference containing an instance of IsSequence, e.g. IORef [Int] or BRef s (Seq Double).

Note that the performance of some of these combinations is terrible. In particular, pushBack or popBack on a list requires traversing the entire list, and any push operations on a Vector requires copying the entire contents of the vector. Caveat emptor! If you must use one of these structures, it's highly recommended to use Seq, which gives the best overall performance.

However, in addition to these instances, this package also provides two additional data structures: double-ended queues and doubly-linked lists. The former is based around mutable vectors, and therefore as unboxed (UDeque), storable (SDeque), and boxed (BDeque) variants. Doubly-linked lists have no such variety, and are simply DLists.

For general purpose queue-like structures, UDeque or SDeque is likely to give you best performance. As usual, benchmark your own program to be certain, and see the benchmark results below.

Benchmark results

The following benchmarks were performed on January 7, 2015, against version 0.2.0.

Ref benchmarkbenchmarking IORef time 4.322 μs (4.322 μs .. 4.323 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 4.322 μs (4.322 μs .. 4.323 μs) std dev 1.401 ns (1.114 ns .. 1.802 ns) benchmarking STRef time 4.484 μs (4.484 μs .. 4.485 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 4.484 μs (4.484 μs .. 4.484 μs) std dev 941.0 ps (748.5 ps .. 1.164 ns) benchmarking MutVar time 4.482 μs (4.482 μs .. 4.483 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 4.482 μs (4.482 μs .. 4.483 μs) std dev 843.2 ps (707.9 ps .. 1.003 ns) benchmarking URef time 2.020 μs (2.019 μs .. 2.020 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 2.020 μs (2.019 μs .. 2.020 μs) std dev 955.2 ps (592.2 ps .. 1.421 ns) benchmarking PRef time 2.015 μs (2.014 μs .. 2.015 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 2.014 μs (2.014 μs .. 2.015 μs) std dev 901.3 ps (562.8 ps .. 1.238 ns) benchmarking SRef time 2.231 μs (2.230 μs .. 2.232 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 2.231 μs (2.230 μs .. 2.231 μs) std dev 1.938 ns (1.589 ns .. 2.395 ns) benchmarking BRef time 4.279 μs (4.279 μs .. 4.279 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 4.279 μs (4.279 μs .. 4.279 μs) std dev 1.281 ns (1.016 ns .. 1.653 ns)Deque benchmarktime 8.371 ms (8.362 ms .. 8.382 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 8.386 ms (8.378 ms .. 8.398 ms) std dev 29.25 μs (20.73 μs .. 42.47 μs) benchmarking IORef (Seq Int) time 142.9 μs (142.7 μs .. 143.1 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 142.7 μs (142.6 μs .. 142.9 μs) std dev 542.8 ns (426.5 ns .. 697.0 ns) benchmarking UDeque time 107.5 μs (107.4 μs .. 107.6 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 107.5 μs (107.4 μs .. 107.6 μs) std dev 227.4 ns (171.8 ns .. 297.8 ns) benchmarking SDeque time 97.82 μs (97.76 μs .. 97.89 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 97.82 μs (97.78 μs .. 97.89 μs) std dev 169.5 ns (110.6 ns .. 274.5 ns) benchmarking BDeque time 113.5 μs (113.4 μs .. 113.6 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 113.6 μs (113.5 μs .. 113.7 μs) std dev 300.4 ns (221.8 ns .. 424.1 ns) benchmarking DList time 156.5 μs (156.3 μs .. 156.6 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 156.4 μs (156.3 μs .. 156.6 μs) std dev 389.5 ns (318.3 ns .. 502.8 ns)Test coverage

As of version 0.2.0, this package has 100% test coverage. If you look at the report yourself, you'll see some uncovered code; it's just the automatically derived Show instance needed for QuickCheck inside the test suite itself.

Categories: Offsite Blogs

How do you keep your dev machines secure?

Haskell on Reddit - Wed, 01/07/2015 - 3:32am

The Haskell development eco-system isn't exactly secure. At least for me, this conflicts with my need for security and privacy, so I'm taking additional steps do isolate my haskell dev environment from the rest of my system. Currently this means that I run cabal install only in separate VMs and won't use the resulting software outside of those VMs.

The whole thing is rather inconvenient, e.g. features integrating ghci into my editor become virtually unusable. I'm interested in how other haskellers deal with this problem. Do you take precautions or do you simply assume the network layer to be secure?

submitted by zeitlens
[link] [41 comments]
Categories: Incoming News

fpco/lts-haskell · GitHub

del.icio.us/haskell - Wed, 01/07/2015 - 2:54am
Categories: Offsite Blogs

fpco/lts-haskell · GitHub

del.icio.us/haskell - Wed, 01/07/2015 - 2:54am
Categories: Offsite Blogs

How will Typeclassopedia change in ghc-7.10?

Haskell on Reddit - Wed, 01/07/2015 - 2:36am

By "Typeclassopedia" I mean this link.

submitted by eccstartup
[link] [20 comments]
Categories: Incoming News

Why is my minmax_element implementation so slow?

Haskell on Reddit - Tue, 01/06/2015 - 3:58pm

I just completed the FP101x course on edX and I was practicing some algorithms.

I had implemented the minmax_element algorithm of the C++ STL on my own, using the same algorithm :

http://pastie.org/9815964

Then I attempted to write the same function with an a very similar interface in Haskell

http://pastie.org/9816989

Surprisingly the code is very similar, except recursion is used instead of looping.

The C++ version is about 15000 times faster when I use a vector and about 800 times faster when I use a list.

What gives?

submitted by rep_movsd
[link] [31 comments]
Categories: Incoming News

Mutation analysis for Haskell (alpha)

Haskell on Reddit - Tue, 01/06/2015 - 11:33am

Here is our MuCheck library which implements mutation analysis for Haskell.

Mutation analysis is used to evaluate quality of test suites. It does this by introducing small defects in the program, and verifying whether the test suite is able to detect the introduced defect.

A companion package is DMuCheck which allows you to parallelize the analysis either in multiple processes or across different machines using cloud-haskell.

The mutation analysis still needs to be done per function, and I still haven't implemented filtering out non-covered mutants. But it should be usable otherwise. It currently implements adapters for QuickCheck, SmallCheck, HUnit and Hspec

Any comments on better structuring it and improving it are appreciated.

submitted by blufox
[link] [9 comments]
Categories: Incoming News

Should I hold my hopes for TypeDirectedNameResolution?

Haskell on Reddit - Tue, 01/06/2015 - 10:23am

Today I learned about Type directed name resolution.

It basically let you define functions with the same name but different first argument type:

data X = ... data Y = ... null :: X -> Bool null :: Y -> Bool

which one could argue to be some kind of "explicit type classes but not really type classes".

You can see its benefits especially for container classes like Map and Set and lists, that share a lot of common member functions.

And also introduces the obj.fn notation so that any functions of type:

fn :: Type -> Rest

can be called as on any object obj of type Type as:

obj.fn

So give something like:

data Location = Location { address :: Address , city :: City } data Person = Person { location :: Person , ... }

, given person of type Person you could have:

let city = person.location.city address = person.location.address

That proposal is part of the let's fix records problem (to which there are other proposals I have yet to take a look at).

How was that discussion (started 3 years ago AFAIK) ended? Where could I find an implementation of such an extension? How likely it is that we get to solve this problem in future versions of Haskell?

Thanks for the attention.

submitted by notjefff
[link] [30 comments]
Categories: Incoming News

GHC 7.10.0-rc1 build for Mac OS X

Haskell on Reddit - Tue, 01/06/2015 - 8:46am

I made a Mac OS X build of GHC 7.10.0 RC1. It was suggested that I post it here, so here you go. You can download it here, and install it as usual (./configure --prefix=<prefix> && make install). Let me know in the comments if you run into any issues.

submitted by cameleon
[link] [2 comments]
Categories: Incoming News

Making a Blog with Yesod

Haskell on Reddit - Tue, 01/06/2015 - 8:39am
Categories: Incoming News

Philip Wadler: A contrary view to 'Antidotes to the Imitation Game'

Planet Haskell - Tue, 01/06/2015 - 8:02am
For five years, Barry Cooper has run the Alan Turing Year (ATY) mailing list, tracking worldwide activity related to the Turing Centennial and after. If you want a contrast to the view given by Antidotes to The Imitation Game, see Barry's most recent ATY post.

While I agree that fiction can sometimes can closer to truth than nonfiction, I disagree with Barry's claim that the 'higher-order' interpretation of the film accurately captures the arc of Turing's life. I suspect the real Turing differs hugely from the film's version, despite the effort and skill Tyldum, Cumberbatch, and others invested in the film.

Many computing scientists are disappointed by the divergence from history in The Imitation Game, while others think that if it does well at the Academy Awards that the popularisation of Turing will be good for our profession. There is something to be said for both points.

Hollywood's attempts at biography seems to inevitably involve gross oversimplification or distortion. Is this really necessary for a film to succeed? Does anyone have favourite examples of films that did not grossly distort their subject matter?

Categories: Offsite Blogs

I think I don't "get" netwire 5.0 types...

Haskell on Reddit - Tue, 01/06/2015 - 5:59am

Hello! I'm tying to learn the basics of FRP using netwire. However, I don't "get" what the Session type means (or what it should be). Consider the following example:

-- Let's start with the most simple wire I can imagine oneIfLessThanTen :: SimpleWire Int Int oneIfLessThanTen = mkPure_ $ \a -> if a < 10 then Right (a + 1) else Left () twoIfLessThanTwenty :: SimpleWire Int Int twoIfLessThanTwenty = mkPure_ $ \a -> if a < 20 then Right (a + 2) else Left () compositeWire :: SimpleWire Int Int compositeWire = oneIfLessThanTen <|> twoIfLessThanTwenty

So far, so good. Now, let's create a funcion to loop in the wire until it inhibits and print the value on the screen:

-- AFAIK, i need a wire, a session and the initial value. -- Since I dont know/care about the session let's just call it 's' loopWire :: SimpleWire Int Int -> s -> Int -> IO () loopWire wire session value = do (step, newSession) <- stepSession session (result, newWire) <- stepWire wire step (Right value) case result of Left _ -> putStrLn "Signal inhibited." Right v -> do putStrLn ("Value: " ++ show v) loopWire newWire newSession v

Unsurprisingly, the code doesn't compile (Couldn't match type "Identity" with "IO"). Looks like my simple wires aren't that simple anymore. My first idea was to change from SimpleWire Int Int to Wire s e m Int Int, but this time the compiler borked (Couldn't match type "e" with "()" and Couldn't match type "m" with "IO"), which made me turn my wires into Wire s () IO Int Int.

However, this was not enough (Couldn't match expected type "s" with actual type "Session IO s"). Since I couldn't even imagine what exactly the session type should be, I cheated and copied the definition from SimpleWire. This is the end result (and it's working):

oneIfLessThanTen :: Wire (Timed NominalDiffTime ()) () IO Int Int oneIfLessThanTen = ... twoIfLessThanTwenty :: Wire (Timed NominalDiffTime ()) () IO Int Int twoIfLessThanTwenty = ... compositeWire :: Wire (Timed NominalDiffTime ()) () IO Int Int compositeWire = oneIfLessThanTen <|> twoIfLessThanTwenty loopWire :: Wire (Timed NominalDiffTime ()) () IO Int Int -> Session IO (Timed NominalDiffTime ()) -> Int -> IO () loopWire wire session value = ... main :: IO () main = loopWire compositeWire clockSession_ 0

Now, I have to ask:

  1. I think I'm making this way more complicated than it should. How can I avoid such crazy type signatures?
  2. In the type Wire s e m a b, how I'm supposed to "guess" the type of s? The e, m, a and b are pretty clear to me, but I can't get what the type of s should be.
  3. This is probably related to the above questions, but I couldn't run testWire or testWireM at all (the types never matched - I'm probably lost in the function signature)
  4. Both SimpleWire and WireP looks quite limited (because of Identity). Am I missing something here?

EDIT: Syntax error when typing the code.

submitted by ibraim_gm
[link] [7 comments]
Categories: Incoming News