News aggregator

Functional Jobs: Pragmatic Haskell Developer at Anchor Systems (Full-time)

Planet Haskell - Tue, 07/29/2014 - 11:51pm

Opening for additional Haskell developers

About Anchor

Anchor Systems, based in Sydney, Australia, has been successfully providing managed hosting and infrastructure to clients around the world for almost 15 years. The critical difference in the service we offer our customers often comes down to operations expertise and infrastructure configuration that balance the short-term need to get a new installation running with the long-term imperatives of security, scalability, and flexibility.

The engineering department is building the analytics capabilities, internal systems, cloud infrastructure, deployment tooling, and operations practices that will be needed to take the company to the next level. We have every intention of changing the hosting industry to make developing applications — and operating them at scale — easier for people facing massive growth.

Haskell is the working language for all internal development; the benefits of type safety, resilience when refactoring, and stability over time, added to the power of functional programming, have already paid dividends. We have a significant programme of work ahead, and we're looking to add further capability as we face these challenges. Interested in helping?

Requirements

You'll need to have demonstrable experience programming in Haskell, familiarity with open source development practices, and the ability to tolerate other people refactoring your code.

Exposure to the harsh realities of IT operations and experience of the thrill of Linux systems administration will stand you in good stead. Knowledge of the history, progress, and problems in configuration management will be helpful.

A history of contributing to libraries on Hackage would be relevant. More relevant would be a track record of working well in the open (regardless of language or project). Good documentation is as important as good code (a type signature is not sufficient), and command of written English is necessary both to collaborate with your peers and to describe and promote your results.

A Bachelor's in science, engineering, or xenobiology will be well thought of. Your specific field of study certainly doesn't matter, but how you use computing to solve problems in your chosen field does. A degree in engineering will be a distinct asset. An undergraduate degree in computer science will not be held against you.

Questions?

Feel free to reach out to us if you have any questions; we're on IRC in #haskell, on Twitter, and GitHub: afcowie, pingu, thsutton, glasnt, fractalcat, tranma, oswynb, and more. Andrew, Christian, and Thomas will be at ICFP in Gothenburg so please do come say hello if you're interested.

To apply, send us a PDF copy of your CV along with the URL of your GitHub account. We also invite you to include a pointer to something that shows you in a good light: a mailing list thread where you argued a contentious position, a piece of particularly thorny code in which you found an elegant solution to a problem, or even an essay or article you've written.

We don't do international relocations (sorry everyone who wants to work remotely from Hawaii), but we readily sponsor work visas for new hires should they need one. So whether you've just moved to Australia or have been here for generations, give us a shout next time you're in Sydney.

Get information on how to apply for this position.

Categories: Offsite Blogs

FP Complete: vectorBuilder: packed-representation yielding for conduit

Planet Haskell - Tue, 07/29/2014 - 10:00pm

This post contains fragments of active Haskell code, best viewed and executed at https://www.fpcomplete.com/blog/2014/07/vectorbuilder-packed-conduit-yielding

Back in March, I mentioned that we'd be using conduit for high performance analyses. We've been busy working on various aspects of this behind the scenes. This is the first publicly available follow-up since then.

One issue with financial analyses is bridging the gap between in-memory representations and streaming data. The former allows for higher performance for many forms of analysis, while the latter allows us to deal with far larger data sets and to generate output more quickly for certain analyses.

We'll be using the vector package almost exclusively for efficient in-memory representations in IAP, and conduit for streaming data. The question comes: what happens when we need to transition from one to the other? There are already functions like yieldMany to yield values from a Vector as a Conduit, and conduitVector or sinkVector to consume values from a Conduit into a packed representation.

While these solutions are always sufficient, they aren't always optimal. When we're going for high speed analyses, we don't want to waste time boxing unpacked values or allocating extra constructors unnecessarily. This blog post introduces a new tool we're adding to the IAP toolchain (and the conduit toolchain in general): vectorBuilder.

The problem

Oftentimes when working with Haskell in a high-performance context, the overhead introduced by a linked-list representation can be too high. Having an extra constructor around each value, a constructor for each cons cell, and the indirection introduced by having to follow pointers can completely kill performance. The most common examples of this are the large speedups you can often achieve by replacing String with Text (or sometimes ByteString), or by using Vectors, especially unboxed or storable Vectors.

conduit represents a stream similarly to a list, including the constructor overheads just mentioned. It's not surprising, therefore, that in a situation where a list would be a poor representation, conduits will often suffer similar performance problems. As with lists, some of this overhead is mitigated by shortcut fusion (a.k.a. rewrite rules), but this isn't always the case.

conduit-combinators provides a helper function which allows us to take back performance, by working with a packed representation instead of creating a bunch of cons cells. It does this by using the vector package's generic mutable interface under the surface, while at a user-facing level providing a simple yield-like function, avoiding the need to muck around with mutable buffers.

This article will cover how to use this function, some implementation details, and comparisons to other approaches.

NOTE: At the time of writing, the version of conduit-combinators provided on School of Haskell does not contain the vectorBuilder function, and therefore the active code below will not run.

Motivating use case

Let's start with a simple goal: we have chunks of bytes coming in, and we want to (1) duplicate each successive byte, so that, e.g., [1, 2, 3] becomes [1, 1, 2, 2, 3, 3], and (2) rechunk the values into vectors of size 512. The original data could be chunked in any way, so we can't rely on any specific incoming chunk size (a known chunk size of, say, 256 would have been convenient).
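As a reference point, the whole transformation can be pinned down as a pure function over lists. This rechunkRef is a hypothetical name introduced here just to state the specification; the real code below works on Vectors for performance:

```haskell
import Data.Word (Word8)

-- Hypothetical pure reference: double every byte across all incoming
-- chunks, then regroup into chunks of the requested size (the final
-- chunk may be shorter than the rest).
rechunkRef :: Int -> [[Word8]] -> [[Word8]]
rechunkRef size = chunksOf size . concatMap (concatMap (\w -> [w, w]))
  where
    chunksOf _ [] = []
    chunksOf n xs = let (y, z) = splitAt n xs in y : chunksOf n z

main :: IO ()
main = print (rechunkRef 4 [[1, 2], [3]])  -- [[1,1,2,2],[3,3]]
```

Every approach below should agree with this specification; they differ only in how much copying and boxing they do along the way.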

Likely the easiest approach is to convert our stream of chunked values (e.g., ByteString or Vector Word8) into a stream of elements (e.g., Word8), duplicate the individual values, then chunk those back up. Such a solution would look like:

rechunk1 = concatC =$= concatMapC (\x -> [x, x]) =$= conduitVector 512

This uses the concatC combinator to "flatten out" the input stream, concatMapC to duplicate each individual Word8, and then conduitVector to create a stream of 512-sized Vectors. In my simple benchmark, this function took 13.06ms.

But as we can probably guess, this falls into the problem zone described in our introduction. So instead of dealing with things at the individual byte level, let's try to use some higher-level functions operating on Vectors of values instead. Our new approach will be to first mapC over the stream and use vector's concatMap to double each value, and then use takeCE and foldC to extract successive chunks of size 512. In code:

rechunk2 = mapC (concatMap $ replicate 2) =$= loop
  where
    loop = do
        x <- takeCE 512 =$= foldC
        unless (null x) $ yield x >> loop

In the same benchmark, this performed at 8.83ms, a 32% speedup. While respectable, we can do better.

Buffer copying

Our first approach is optimal in one way: it avoids needless buffer copying. Each Word8 is copied precisely once into an output Vector by conduitVector. Unfortunately, this advantage is killed by the overhead of boxing the Word8s and allocating constructors for conduit. Our second approach avoids the boxing and constructors by always operating on Vectors, but we end up copying buffers multiple times: from the original Vector to the doubled Vector, and then when folding together multiple Vectors into a single Vector of size 512.

What we want to do is to be able to yield a Word8 and have it fill up an output buffer, and once that buffer is filled, yield that buffer downstream and start working on a new one. We could do that by directly dealing with mutable Vectors, but that's error-prone and tedious. Instead, let's introduce our new combinator function: vectorBuilder (or its unqualified name, vectorBuilderC).

The idea is simple. vectorBuilder will allocate an output buffer for you. It provides you with a special yield-like function that fills up this buffer, and when it's full, yields the entire buffer downstream for you.

To use it, we're going to use one other combinator function: mapM_CE, which performs an action for every value in a chunked input stream (in our case, for each Word8 in our input Vector Word8s). Altogether, this looks like:

rechunk3 = vectorBuilderC 512 $ \yield' -> mapM_CE (\x -> yield' x >> yield' x)

We call yield' twice to double our bytes. vectorBuilder ensures that each output buffer is of size 512. mapM_CE efficiently traverses the incoming Vectors without creating intermediate data structures.

This version benchmarks at 401.12us. This is approximately 95% faster than our previous attempt!

Avoiding transformers

There's something tricky about the yield' function above. Notice how it's not being used in the Conduit monad transformer, but is instead living in the base monad (e.g., IO). This is not accidental. Not only does this allow us to use existing monadic combinators like mapM_CE, it also allows for far more efficient code. To demonstrate, let's look at two different ways of doing the same thing:

bgroup "transformers" $ let src = return () in
    [ bench "single" $ nfIO $ do
        ref <- newIORef 0
        let incr = modifyIORef ref succ
        src $$ liftIO (replicateM_ 1000 incr)
    , bench "multi" $ nfIO $ do
        ref <- newIORef 0
        let incr = liftIO $ modifyIORef ref succ
        src $$ replicateM_ 1000 incr
    ]

Both of these benchmarks use no conduit features. They both create an IORef, then increment it 1000 times. The difference is that the first calls liftIO once, while the second calls liftIO 1000 times. Let's see the difference in benchmark results:

benchmarking transformers/single
mean: 4.292891 us, lb 4.285319 us, ub 4.303626 us, ci 0.950
std dev: 45.83832 ns, lb 35.04324 ns, ub 59.43617 ns, ci 0.950

benchmarking transformers/multi
mean: 93.10228 us, lb 92.95708 us, ub 93.30159 us, ci 0.950
std dev: 869.6636 ns, lb 673.8342 ns, ub 1.090044 us, ci 0.950

Avoiding extra liftIO calls has a profound performance impact. The reason for this is somewhat similar to what we've been discussing up until now about extra cons cells. In our case, it's extra PipeM constructors used by conduit's MonadIO instance. I don't want to dwell on those details too much right now, as that's a whole separate topic of analysis, involving looking at GHC core output. But let's take it as a given right now.

The question is: how does vectorBuilder allow you to live in the base monad, but still yield values downstream, which requires access to the Conduit transformer? There's a trick here using mutable variables. The implementation essentially works like this:

  • Allocate a new, empty mutable vector.
  • Allocate a mutable variable holding an empty list.
  • Start running the user-supplied Conduit function, providing it with a specialized yield function.
  • The specialized yield function, which lives in the base monad, will write values into the mutable vector. Once that mutable vector is filled, the vector is frozen and added to the end of the mutable variable's list, and a new mutable vector is allocated.
  • The next time the user's function awaits for values from upstream, we jump into action. Since we're already forced to be in the Conduit transformer at that point, this is our chance to yield. We grab all of the frozen vectors from the mutable variable and yield them downstream. Once that's done, we await for new data from upstream, and provide it to the user's function.
  • When the user's function is finished, we freeze the last bit of data from the mutable vector and yield that downstream too.
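The fill-and-flush idea behind those steps can be sketched in plain IO with IORefs. This is a simplified model with made-up names (newBuilder and friends are not part of conduit-combinators, which uses real mutable vectors and the await-time yielding described above), kept single-threaded for clarity:

```haskell
import Data.IORef
import Data.Word (Word8)

-- A simplified model of vectorBuilder's buffering: a yield-like action
-- that appends to a bounded buffer, moving each full buffer into a
-- pending list, plus a flush that hands all pending chunks downstream.
-- (Not thread-safe and O(n) per yield; a sketch, not the real thing.)
newBuilder :: Int -> IO (Word8 -> IO (), IO [[Word8]])
newBuilder size = do
  buf     <- newIORef []   -- current buffer, in reverse order
  pending <- newIORef []   -- completed chunks, newest first
  let yield' w = do
        ws <- atomicModifyIORef' buf (\ws -> (w : ws, w : ws))
        if length ws == size
          then do                       -- buffer full: "freeze" it
            writeIORef buf []
            modifyIORef' pending (reverse ws :)
          else return ()
      flush = do                        -- hand everything downstream
        ws <- readIORef buf
        writeIORef buf []
        chunks <- atomicModifyIORef' pending (\cs -> ([], cs))
        return (reverse chunks ++ [reverse ws | not (null ws)])
  return (yield', flush)

main :: IO ()
main = do
  (yield', flush) <- newBuilder 3
  mapM_ yield' [1 .. 7]
  flush >>= print  -- [[1,2,3],[4,5,6],[7]]
```

The real implementation flushes the pending list every time the user's function awaits, which is what keeps memory bounded in the streaming case.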

The upsides of this approach are ease-of-use and performance. There is one downside you should be aware of: if you generate a large amount of output without awaiting for more data from upstream, you can begin to accumulate more memory. You can force the collection of frozen Vectors to be flushed using the following helper function:

forceFlush :: Monad m => ConduitM i o m ()
forceFlush = await >>= maybe (return ()) leftover

This simply awaits for a value, allowing vectorBuilder to clear its cache, and then gives the new value back as a leftover.

Overall, your goal should be to have a decent trade-off between memory and time efficiency. To demonstrate, try playing around with the functions f1, f2, and f3 in the following code snippet:

{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE FlexibleContexts #-}
import ClassyPrelude.Conduit

forceFlush :: Monad m => ConduitM i o m ()
forceFlush = await >>= maybe (return ()) leftover

-- Memory inefficient, time efficient
f1 :: (Int -> IO ()) -> Sink () IO ()
f1 f = liftIO $ forM_ [1..1000000] f

-- Memory efficient, time inefficient
f2 :: (Int -> Sink () IO ()) -> Sink () IO ()
f2 f = forM_ [1..1000000] $ \i -> do
    f i
    forceFlush

-- Good trade-off
f3 f = forM_ (chunksOf 10000 [1..1000000]) $ \is -> do
    liftIO $ mapM_ f is
    forceFlush
  where
    chunksOf _ [] = []
    chunksOf i x = y : chunksOf i z
      where
        (y, z) = splitAt i x

main = vectorBuilderC 4096 f3 $$ (sinkNull :: Sink (Vector Int) IO ())

ByteString and Vector

It may be surprising to have seen an entire article on packed representations of bytes, and not yet seen ByteString. As a matter of fact, the original use case I started working on this for had nothing to do with the vector package. However, I decided to focus on vector for two reasons:

  1. Unlike bytestring, it provides a well developed mutable interface. Not only that, but the mutable interface is optimized for storable, unboxed, and generic vectors, plus existing helper packages like hybrid-vectors. In other words, this is a far more general-purpose solution.
  2. It's trivial and highly efficient to convert a ByteString to and from a storable Vector.

To demonstrate that second point, let's try to read a file, duplicate all of its bytes as we did above, and write it back to a separate file. We'll use the toByteVector and fromByteVector functions, which I recently added to mono-traversable for just this purpose:

{-# LANGUAGE NoImplicitPrelude #-}
import ClassyPrelude.Conduit
import System.IO (IOMode (ReadMode, WriteMode), withBinaryFile)

double :: (Word8 -> IO ()) -> Sink (SVector Word8) IO ()
double yield' = mapM_CE $ \w -> yield' w >> yield' w

main :: IO ()
main =
    withBinaryFile "input.txt" ReadMode $ \inH ->
    withBinaryFile "output.txt" WriteMode $ \outH ->
        sourceHandle inH
            $$ mapC toByteVector
            =$ vectorBuilderC 4096 double
            =$ mapC fromByteVector
            =$ sinkHandle outH

Comparison with blaze-builder

There's a strong overlap between what vectorBuilder does, and how blaze-builder (and more recently, bytestring's Builder type) are intended to be used. I unfortunately can't give any conclusive comparisons between these two techniques right now. What I can say is that there are cases where using a Builder has proven to be inefficient, and vectorBuilder provides a large performance improvement. I can also say that vectorBuilder addresses many more use cases than Builder. For example, at FP Complete we're planning to use this in financial analyses for creating time series data.

On the other hand, blaze-builder and bytestring's Builder have both had far more real-world tuning than vectorBuilder. They also have support for things such as copying existing ByteStrings into the output stream, whereas vectorBuilder always works by copying a single element at a time.

So for now, if you have a use case and you're uncertain whether to use vectorBuilder or blaze-builder, I recommend either trying both approaches, or discussing it on one of the Haskell mailing lists to get more feedback.

Complete code

The code for most of the blog post above is below. Sorry that it's a bit messy:

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
import ClassyPrelude.Conduit
import Control.Monad.Primitive (PrimMonad)
import Control.Monad.ST (runST)
import Criterion.Main (bench, bgroup, defaultMain, nfIO, whnfIO)
import qualified Data.Vector.Generic as VG
import qualified System.Random.MWC as MWC
import Test.Hspec (hspec, shouldBe)
import Test.Hspec.QuickCheck (prop)

rechunk1 :: ( Monad m
            , VG.Vector vector (Element input)
            , PrimMonad base
            , MonadBase base m
            , MonoFoldable input
            )
         => Conduit input m (vector (Element input))
rechunk1 = concatC =$= concatMapC (\x -> [x, x]) =$= conduitVector 512
{-# INLINE rechunk1 #-}

rechunk2 :: (Monad m, IsSequence a) => Conduit a m a
rechunk2 = mapC (concatMap $ replicate 2) =$= loop
  where
    loop = do
        x <- takeCE 512 =$= foldC
        unless (null x) $ yield x >> loop
{-# INLINE rechunk2 #-}

rechunk3 :: ( MonadBase base m
            , PrimMonad base
            , MonoFoldable input
            , VG.Vector vector (Element input)
            )
         => Conduit input m (vector (Element input))
rechunk3 = vectorBuilderC 512 $ \yield' -> mapM_CE (\x -> yield' x >> yield' x)
{-# INLINE rechunk3 #-}

main :: IO ()
main = do
    hspec $ prop "rechunking" $ \ws -> do
        let src = yield (pack ws :: UVector Word8)
            doubled = concatMap (\w -> [w, w]) ws
            res1 = runST $ src $$ rechunk1 =$ sinkList
            res2 = runST $ src $$ rechunk2 =$ sinkList
            res3 = runST $ src $$ rechunk3 =$ sinkList
        res1 `shouldBe` (res2 :: [UVector Word8])
        (res3 :: [UVector Word8]) `shouldBe` (res2 :: [UVector Word8])
        (unpack $ (mconcat res2 :: UVector Word8)) `shouldBe` (doubled :: [Word8])
        case reverse res1 :: [UVector Word8] of
            [] -> return ()
            x:xs -> do
                (length x <= 512) `shouldBe` True
                all ((== 512) . length) xs `shouldBe` True

    gen <- MWC.createSystemRandom
    bytes <- replicateM 20
           $ MWC.uniformR (12, 1024) gen >>= MWC.uniformVector gen

    defaultMain
        [ bgroup "copy bytes"
            [ bench "rechunk1" $ whnfIO
                $ yieldMany (bytes :: [UVector Word8])
               $$ (rechunk1 :: Conduit (UVector Word8) IO (UVector Word8))
               =$ sinkNull
            , bench "rechunk2" $ whnfIO
                $ yieldMany (bytes :: [UVector Word8])
               $$ (rechunk2 :: Conduit (UVector Word8) IO (UVector Word8))
               =$ sinkNull
            , bench "rechunk3" $ whnfIO
                $ yieldMany (bytes :: [UVector Word8])
               $$ (rechunk3 :: Conduit (UVector Word8) IO (UVector Word8))
               =$ sinkNull
            ]
        , bgroup "transformers" $ let src = return () in
            [ bench "single" $ nfIO $ do
                ref <- newIORef (0 :: Int)
                let incr = modifyIORef ref succ
                src $$ liftIO (replicateM_ 1000 incr)
            , bench "multi" $ nfIO $ do
                ref <- newIORef (0 :: Int)
                let incr = liftIO $ modifyIORef ref succ
                src $$ replicateM_ 1000 incr
            ]
        ]
Categories: Offsite Blogs

wren gayle romano: "Is it bad to look for signs from the past?"

Planet Haskell - Tue, 07/29/2014 - 6:14pm

Someone asked recently whether it's bad to seek "signs" of being trans from the past, and why or why not. This question deserves to be more widely circulated. Within trans circles a fair number of people have an understanding of the situation and its complexity, but it's something I think non-trans circles should also be aware of, especially given the recent publicity surrounding trans lives.

The problems are twofold:

The first problem is that a lot of people look for signs because they're seeking some sort of validation. The trouble here is that you end up misinterpreting and overanalyzing your own life in search of that validation. It's not that the past cannot provide validation for your present; it's that doing so misses the point. What we want (more often than not) is acceptance of who we are now and recognition of our current experience. There's more to current identities, pains, and experiences than the past that gave rise to them, so validation can come from sources other than the past. Moreover, it's all too easy for people to "validate" your past while simultaneously invalidating your present, so validation from the past is not stable. Altogether, none of this is trans-specific: it's a general problem with seeking retrospective validation, and it applies equally to people who've suffered abuse, experience mental illness, have changed careers, etc.

The second problem is that, in overanalyzing our pasts in search of validation, we all too often end up reinscribing "standard" trans narratives. If our pasts do not fit the "standard" narrative then we will not find the validation we seek, thus we will call our current understanding even further into question, and this sense of invalidation will only make us feel worse. If our pasts only partially fit the "standard" narrative then, in search of validation, we will highlight those memories and background the others; thus denying ourselves the full actualization of our personal history, and invalidating at least in part who we are. And if our pasts (somehow) completely fit the "standard" narrative then, in holding that history up as "proof" of our legitimacy, we end up marginalizing and invalidating everyone with different narratives. Again, this isn't a trans-specific problem (cf. "standard" narratives of gay lives or depression prior to, say, the 1970s); though it's especially problematic for trans people because of the dearth of public awareness that our narrative tapestries are as rich and varied as cis narrative tapestries.

There's nothing wrong with seeking support for your current self from your past memories. Doing so is, imo, crucial in coming to understand, respect, and take pride in our selves. The problems of retrospection are all in the mindset with which it is pursued. We shouldn't need to rely on "born this way" narratives: however we were born, we are here now, and in virtue of our presence alone we are worthy of respect and validation.

Fwiw, I do very much value my "signs", and often share them as amusing anecdotes— both to foster understanding, and to destabilize people's preconceived notions. But I do not seek validation in these signs; they're just collateral: symptoms of, not support for, who I am.


Categories: Offsite Blogs

Danny Gratzer: Many Shades of Halting Oracles

Planet Haskell - Tue, 07/29/2014 - 6:00pm
Posted on July 30, 2014

I'm going to take a quick break from arguing with people on the internet to talk about a common point of confusion with theorem provers.

People will often state things like “A program in Coq never diverges” or that “we must prove that X halts”. To an outsider, that sounds impossible! After all, isn’t the halting problem undecidable?

Now the thing to realize is that, while the halting problem is indeed undecidable, we're not solving it. The halting problem essentially states:

For an arbitrary Turing machine P, there is no algorithm, itself guaranteed to terminate, that will return true if P halts and false if it diverges.

In theorem provers, we cleverly avoid this road block with two simple tricks. I’m going to discuss these in the context of Coq but these ideas generalize between most theorem provers.

Being Negative

A program in Coq must halt. To do otherwise would introduce a logical inconsistency. So to enforce this we need to statically decide whether some program halts.

We just said that this is impossible though! To escape this paradox Coq opts for a simple idea: reject good programs.

Rather than guaranteeing to return true for every good program, we state that we’ll definitely reject all bad programs and then some.

For example, this termination checker would be logically consistent

terminates :: CoqProgram -> Bool
terminates _ = False

It’d be useless of course, but consistent. Coq therefore accepts a certain set of programs which are known to terminate. For example, ones that limit themselves only to guarded coinduction or structural induction.

Getting Our Hands Dirty

While it may be impossible to decide the termination of an arbitrary program, it’s certainly possible to prove the termination of a specific program.

When Coq's heuristics fail, we can always resort to manually proving that our code will terminate. This may not be pleasant, but it's certainly doable. By lifting this burden from Coq, we go from "constructing an arbitrary proof of termination" to "checking an arbitrary proof of termination", which is decidable.

In Coq we can do this with well founded recursion. Simply put, well founded recursion means that we shift from using only term "size" to decide what counts as a smaller recursive call to using any suitable binary relation. If you're not interested in Coq specifically, you can check out your preferred proof assistant's formalization of well founded recursion.
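As a concrete taste of the style this enables, here is a Haskell function (euclid is just an illustrative name) that is not structurally recursive, yet terminates because its second argument strictly decreases under the well founded relation < on the naturals:

```haskell
-- Euclid's algorithm: for b > 0 we have a `mod` b < b, so the second
-- argument strictly decreases under < on the naturals, which is a well
-- founded relation, even though a `mod` b is not a subterm of b.
-- This is exactly the kind of function well founded recursion admits.
euclid :: Int -> Int -> Int
euclid a 0 = a
euclid a b = euclid b (a `mod` b)

main :: IO ()
main = print (euclid 48 36)  -- 12
```

In Coq, the price of writing euclid is exactly one proof obligation: that < on the naturals is well founded (which the standard library already provides).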

To this end, we define a relation R : A -> A -> Prop for some type A : Set. Read R x y as "x is smaller than y".

Now we must show that this relation preserves some definition of "sanity". This means that when a function receives x, it should be able to recurse on any y such that R y x. It also means that there is no infinite descending chain of terms R x y, R z x, R w z, and so on, because such a chain would let us recurse infinitely. To capture this idea, we must prove well_founded A R. What's this "well founded" thing, you say?

Well it’s just

Definition well_founded A R := forall a : A, Acc R a

This Acc thing means “accessible”,

Inductive Acc (A : Type) (R : A -> A -> Prop) (x : A) : Prop :=
  Acc_intro : (forall y : A, R y x -> Acc R y) -> Acc R x

So something is accessible in R if everything less than it is also accessible.

We can easily prove that if R is well_founded there is no infinite chain that could lead us to infinite recursion.

Section founded.
  Variable A : Set.
  Variable R : A -> A -> Prop.
  Variable well_founded : well_founded R.

  CoInductive stream :=
  | Cons : A -> stream -> stream.

  CoInductive tower_of_bad : stream -> Prop :=
    OnTop : forall x y rest,
      R y x ->
      tower_of_bad (Cons y rest) ->
      tower_of_bad (Cons x (Cons y rest)).

  Lemma never_on_top : forall x, forall rest, ~ tower_of_bad (Cons x rest).
    intro; induction (well_founded x);
    inversion 1; try subst;
    match goal with
      [H : context[~ _] |- _ ] => eapply H; eauto
    end.
  Qed.

  Theorem no_chains : forall xs, ~ tower_of_bad xs.
    destruct 1; eapply never_on_top; eauto.
  Qed.
End founded.

We're using a powerful trick in never_on_top: we're inducting upon Acc! This is the key to using well founded recursion. By inducting upon Acc instead of one of the arguments of our function, we can easily recurse on any term y with R y x.

This is handed to us by the lovely Fix (uppercase).

Fix : well_founded R ->
      forall P : A -> Type,
      (forall x : A, (forall y : A, R y x -> P y) -> P x) ->
      forall x : A, P x

So Fix is the better, cooler version of structural recursion that we were after. It lets us recurse on any y where R y x.

So in some sense, you can view Coq’s Fixpoint as just a specialization of Fix where R x y means that x is a subterm of y.

Wrap Up

So in conclusion, theorem provers don’t do the impossible. Rather they have a small battery of tricks to cheat the impossible general case and simplify common cases.

Back to the internet I go.

Categories: Offsite Blogs

RocksDB Haskell bindings

Haskell on Reddit - Tue, 07/29/2014 - 5:18pm
Categories: Incoming News

ETAPS 2015 1st call for papers

General haskell list - Tue, 07/29/2014 - 4:22pm
******************************************************************
CALL FOR PAPERS: ETAPS 2015
18th European Joint Conferences on Theory And Practice of Software
London, UK, 11-18 April 2015
http://www.etaps.org/2015
******************************************************************
Categories: Incoming News

Fast 2D monadic loop

Haskell on Reddit - Tue, 07/29/2014 - 3:34pm

A common pattern for me is iterating over and writing to every slot in a 2D array stored in a vector (having the 2D coordinate is a requirement, so no simple V.map etc.). One question for me is always: what's the fastest way to write such a loop? Since I just came across this very pattern again, I thought I'd quickly do some measurements and also give the newly released 'loop' package a try:

-- 9.6 10.3 FPS
forM_ [(x, y) | y <- [0..h - 1], x <- [0..w - 1]] $ \(px, py) ->

-- 2.7 10.7 FPS
numLoop 0 (h - 1) $ \py -> numLoop 0 (w - 1) $ \px ->

-- 9.6 11.0 FPS
forLoop 0 (< h) (+1) $ \py -> forLoop 0 (< w) (+1) $ \px ->

-- 10.7 10.8 FPS
forM_ [0..h - 1] $ \py -> forM_ [0..w - 1] $ \px ->

-- 9.4 10.3 FPS
-- from https://github.com/Twinside/Juicy.Pixels/blob/395f9cd18ac936f769fc63781f67c076637cf7aa/src/Codec/Picture/Jpg/Common.hs#L179
rasterMap w h $ \px py ->

Sorry for the stupid 'frames per second' performance numbers: this loop has a quickly thrown-together Mandelbrot fractal for its body and loops over a texture image buffer to be uploaded to OpenGL for display, so that's the number I had at hand. The first number is for running in 'StateT ReaderT IO', the second is for plain IO. Clearly there is a difference between these loops, even with a non-trivial payload in the body. The 'numLoop' one is rather strange; it seems to be affected far more by the transformer overhead than the other solutions. This kind of code tends to perform a lot better with -fllvm, so the fastest way might depend on the backend.

Is there some accepted known best way of doing this?

EDIT:

Since all the replies so far have been about how to allocate a tightly packed vector, how to index it, how to traverse memory in cache friendly order etc., I just want to state that this is absolutely clear to me already. I know all that and am already doing it. I'm really asking just about the looping construct, like in the four examples I gave. It seems there's quite a performance discrepancy between the different approaches, prompting packages like loop (hackage.haskell.org/package/loop) to be written.

submitted by SirRockALot1
Categories: Incoming News

Roman Cheplyaka: Beware of bracket

Planet Haskell - Tue, 07/29/2014 - 3:00pm

Many Haskellers know and love the bracket function:

bracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c

In our volatile world bracket provides a safe haven of certainty: the resource will be released, no matter what. However, there’s a catch. (Nice pun, huh?)

This article describes two observations about bracket that are useful to keep in mind when writing concurrent Haskell code.

Unfulfilled promise

Is the resource you're trying to release something internal to your code, or is it observable from outside?

In the former case, you’re fine. An example would be taking an MVar and putting it back afterwards:

withMVar :: MVar a -> (a -> IO b) -> IO b
withMVar m io = bracket (takeMVar m) (putMVar m) io

Now consider the case when the resource is something tangible — let’s say, a directory.

withFoo :: IO a -> IO a
withFoo io = bracket_ (createDirectory "foo") (removeDirectoryRecursive "foo") io

You might think that there’s no way for foo to remain in the filesystem after the program is done, barring OS-level issues.

In reality, you only get this guarantee if withFoo is executed in the main Haskell thread (i.e. the thread that executes main). When the main thread finishes, the program finishes, too, without sending exceptions to other threads and giving them any chance to finalize.

This limitation is especially acute in library code, where you don’t know whether you’re running in the main thread, nor have any way to tie yourself to it.
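One workaround, when you do control main, is to make the main thread wait for the forked worker before exiting, so the worker's bracket finalizer is guaranteed to have run. A minimal sketch; the IORef flag stands in for a real external resource such as the "foo" directory, and the helper name is mine:

```haskell
import Control.Concurrent
import Control.Exception (bracket_)
import Data.IORef

-- Fork a worker whose bracket_ cleanup must run before the program
-- exits; main blocks on an MVar that is only filled after the worker
-- (cleanup included) has finished. Returns whether cleanup ran.
runWorkerAndWait :: IO Bool
runWorkerAndWait = do
  cleanedUp <- newIORef False
  done      <- newEmptyMVar
  _ <- forkFinally
         (bracket_ (return ())                  -- acquire, e.g. createDirectory "foo"
                   (writeIORef cleanedUp True)  -- release, e.g. removeDirectoryRecursive "foo"
                   (threadDelay 10000))         -- the worker's real work
         (\_ -> putMVar done ())
  takeMVar done  -- without this, main could exit mid-cleanup
  readIORef cleanedUp
```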

Inadvertent mask

You could try to write something like

main = do
    bracket (forkIO myComputation) killThread $ \_ -> do
        ...

The idea here is that if main exits, for whatever reason, you’ll send an exception to the thread you forked, and give it a chance to clean up.

First, this isn’t going to help much because main will exit right after killThread, probably right in the middle of myComputation’s cleanup process. Some kind of synchronisation should be introduced to address this properly. (The price is that your program may not exit promptly when you interrupt it with Ctrl-C.)

There’s another, more subtle issue with the code above. Let’s look at the definition of bracket:

bracket before after thing =
    mask $ \restore -> do
        a <- before
        r <- restore (thing a) `onException` after a
        _ <- after a
        return r

As you see, the before action is run in the masked state. Forked threads inherit the masked state of their parents, so myComputation and all threads spawned by it will unwittingly run in the masked state, unless they do forkIOWithUnmask.
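The inherited mask, and the forkIOWithUnmask escape hatch, can be observed directly: the child below is forked from inside bracket's (masked) acquire action, yet reports Unmasked because it restores its own masking state. A small self-contained sketch; the helper name is mine:

```haskell
import Control.Concurrent
import Control.Exception

-- Fork a child from bracket's acquire action. Without forkIOWithUnmask
-- the child would inherit MaskedInterruptible; unmask restores it to
-- Unmasked, which the child reports back through an MVar.
forkUnmaskedChild :: IO MaskingState
forkUnmaskedChild = do
  result <- newEmptyMVar
  bracket
    (forkIOWithUnmask $ \unmask ->
       unmask (getMaskingState >>= putMVar result))
    killThread
    (\_ -> takeMVar result)
```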

In this simple case, you should just use withAsync from the async package. What about more complex ones?

If you do forking explicitly, then you can write bracket-like code yourself and restore the forked computation. Here’s an example of synchronised cleanup:

main = do
    cleanupFlag <- atomically $ newTVar False
    mask $ \restore -> do
        pid <- forkIO $ restore $ myComputation cleanupFlag
        restore restOfMain `finally` do
            killThread pid
            -- wait until myComputation finishes its cleanup
            -- and sets the flag to True
            atomically $ readTVar cleanupFlag >>= check

(You could use an MVar for synchronisation just as well.)

And what if forking happens inside some library function that you need to call? In that case, you may want to restore that whole function from the beginning.

Categories: Offsite Blogs

Does every monad transformer t have a map t (t I) a -> t I a?

Haskell on Reddit - Tue, 07/29/2014 - 10:49am

Does every monad transformer t have a map t (t I) a -> t I a?

I'm using I as an abbreviation for Identity.

I've checked this is indeed the case for MaybeT, EitherT/ErrorT, WriterT, StateT, ContT. Is there a sensible monad transformer that doesn't support this transformation?

(NB: I previously submitted this with an incorrect type signature so I resubmitted)
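For concreteness, here is the map in question written out by hand for MaybeT with I = Identity (the name `squash` is mine): both layers collapse, and the result succeeds only when both the outer and inner MaybeT succeeded.

```haskell
import Control.Monad.Trans.Maybe (MaybeT (..))
import Data.Functor.Identity (Identity (..))

-- t (t I) a -> t I a, specialized to t = MaybeT and I = Identity.
squash :: MaybeT (MaybeT Identity) a -> MaybeT Identity a
squash m = MaybeT . Identity $
  case runIdentity (runMaybeT (runMaybeT m)) of
    Just (Just x) -> Just x
    _             -> Nothing
```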

submitted by tomejaguar
[link] [10 comments]
Categories: Incoming News

Overlapping and incoherent instances

glasgow-user - Tue, 07/29/2014 - 10:11am
Friends

One of GHC's more widely-used features is overlapping (and sometimes incoherent) instances. The user-manual documentation is here: <http://www.haskell.org/ghc/docs/latest/html/users_guide/type-class-extensions.html#instance-overlap>. The use of overlapping/incoherent instances is controlled by LANGUAGE pragmas: OverlappingInstances and IncoherentInstances respectively.

However the overlap/incoherent-ness is a property of the *instance declaration* itself, and has been for a long time. Using LANGUAGE OverlappingInstances simply sets the "I am an overlapping instance" flag for every instance declaration in that module.

This is a Big Hammer. It gives no clue about *which* particular instances the programmer is expecting to be overlapped, nor which are doing the overlapping. It brutally applies to every instance in the module. Moreover, when looking at an instance declaration, there is no nearby clue that it might be overlapped. The clue might be in the command line that compiles that module!

Iavor
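A minimal sketch of the Big Hammer in action (class and instances invented for illustration): the module-wide pragma marks every instance below as overlapping, with nothing at either instance saying which overlaps which.

```haskell
{-# LANGUAGE FlexibleInstances    #-}
{-# LANGUAGE OverlappingInstances #-}

class Describe a where
  describe :: a -> String

-- Catch-all instance; nothing here marks it as overlappable.
instance Describe a where
  describe _ = "something generic"

-- More specific instance; nothing here marks it as overlapping.
instance Describe Int where
  describe _ = "an Int"
```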
Categories: Offsite Discussion

Deduce issue.

haskell-cafe - Tue, 07/29/2014 - 7:29am
Hi, I have code like the following, and compiling it gave me the error shown at the end. What should I do?

```
class WidgetClass w

data Widget = forall a. WidgetClass a => Widget a

instance WidgetClass Widget

lookupWin :: (WidgetClass w) => WINDOW -> w -> Maybe w
lookupWin xwin widget =
    if widgetWindowId widget == xwin
        then Just widget
        else foldl (\a (Widget b) -> maybe (lookupWin xwin b) Just a)
                   Nothing $ widgetChildren widget
```

---

```
Graphics/HUI/Widget.hs:15:23:
    Could not deduce (w ~ a)
    from the context (WidgetClass w)
      bound by the type signature for
                 lookupWin :: WidgetClass w => WINDOW -> w -> Maybe w
      at Graphics/HUI/Widget.hs:10:14-54
    or from (WidgetClass a)
      bound by a pattern with constructor
                 Widget :: forall a. WidgetClass a => a -> Widget,
               in a lambda abstraction
      at Graphics/HUI/Widget.hs:14:19-26
    ‘w’ is a rigid type variable bound by
      the type signature for
```
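One common fix, sketched under assumptions about the missing pieces (`WINDOW`, the class methods, and the test widget type in the usage example are placeholders of mine): the children are existentially quantified, so the recursion cannot return the caller-chosen type `w`; returning the `Widget` wrapper instead avoids the `w ~ a` mismatch.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

type WINDOW = Int  -- placeholder for the real X11 WINDOW type

class WidgetClass w where
  widgetWindowId :: w -> WINDOW
  widgetChildren :: w -> [Widget]

data Widget = forall a. WidgetClass a => Widget a

-- Return the existential wrapper, not the caller-chosen `w`: each child
-- may have a different concrete type, and Widget can hold any of them.
lookupWin :: WidgetClass w => WINDOW -> w -> Maybe Widget
lookupWin xwin widget
  | widgetWindowId widget == xwin = Just (Widget widget)
  | otherwise =
      foldl (\acc (Widget child) -> maybe (lookupWin xwin child) Just acc)
            Nothing (widgetChildren widget)
```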
Categories: Offsite Discussion

How to update a cabal dependency automatically

haskell-cafe - Tue, 07/29/2014 - 6:52am
I have been working on a Haskell scotty project recently. I have just added configuration support for the project, which added an extra dependency on Control.Lens. My question is: is there a way to update the .cabal file automatically, just like cabal init but without recreating the .cabal file? Thanks in advance.
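Absent such a tool, the usual answer is to edit the build-depends field by hand. A minimal sketch of the relevant stanza; the component name and version bounds are examples only, and Control.Lens lives in the lens package:

```
executable my-scotty-app
  build-depends: base >=4.6 && <5
               , scotty
               , lens   -- the newly added dependency
```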
Categories: Offsite Discussion

Is there a Haskell implementation of the Lindley-Wadler-Yallop arrow calculus?

Haskell on Reddit - Tue, 07/29/2014 - 5:19am

Lindley, Wadler, and Yallop published The Arrow Calculus way back in 2009, and though this work appears to significantly advance reasoning about and expression of computation with arrows beyond Paterson's refinements (represented in the current Control.Arrow module) on Hughes' foundational work, I can't find any Haskell implementations of it. Does such code exist?

submitted by ribald_eagle
[link] [7 comments]
Categories: Incoming News

early adopters rejoice! Haskell Platform 2014.2.0.0 Release Candidate 2

haskell-cafe - Tue, 07/29/2014 - 4:31am
The long anticipated Haskell Platform 2014.2 release, including GHC 7.8.3 and numerous updated packages, is almost here! We have created "Release Candidate 2" installers for OS X and Windows, and believe, barring show stopper issues, creepers exploding, or unexpected trips to the nether, these will likely be blessed as final by the end of the week. If you would like to be an early adopter, please try 'em out...

- source tarball: haskell-platform-2014.2.0.0-RC2.tar.gz <http://www.ozonehouse.com/mark/platform/haskell-platform-2014.2.0.0-RC2.tar.gz>
- source repo: haskell/haskell-platform at 2014.2.0.0-RC2 <https://github.com/haskell/haskell-platform/tree/2014.2.0.0-RC2>
- windows 32bit: hskellPlatform-2014.2.0.0-i386-RC2-setup.exe <http://www.ozonehouse.com/mark/platform/hskellPlatform-2014.2.0.0-i386-RC2-setup.exe>
- windows 64bit: hskellPlatform-2014.2.0.0-x86_64-RC2-setup.exe <http://www.ozonehouse.com/mark/platform/hskellPlatform-2014.2.0.0-x86_64-RC2-setup.exe>
- os x 64b
Categories: Offsite Discussion