Saturday night I had a fainting spell. Sunday my eyes were burning, I was feverish, weak, and had the beginnings of a migraine. Monday was completely lost in the blaze of a migraine. Tuesday I was starting to feel better, and then, nope: threw up that night. Wednesday I awoke with what felt like a raging sinus infection; spent the whole day in a haze of Sudafed and ibuprofen, and went through literally an entire box of tissues.
Starting to feel a little better this morning, so I figure this is the end. It was a nice life. Y'might want to bar your doors today, just in case it's locusts.
Junfeng Yang, Heming Cui, Jingyue Wu, Yang Tang, and Gang Hu, "Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading", Communications of the ACM, Vol. 57 No. 3, Pages 58-69.
We believe what makes multithreading hard is rather quantitative: multithreaded programs have too many schedules. The number of schedules for each input is already enormous because the parallel threads may interleave in many ways, depending on such factors as hardware timing and operating system scheduling. Aggregated over all inputs, the number is even greater. Finding a few schedules that trigger concurrency errors out of all enormously many schedules (so developers can prevent them) is like finding needles in a haystack. Although Deterministic Multi-Threading reduces schedules for each input, it may map each input to a different schedule, so the total set of schedules for all inputs remains enormous.
We attacked this root cause by asking: are all the enormously many schedules necessary? Our study reveals that many real-world programs can use a small set of schedules to efficiently process a wide range of inputs. Leveraging this insight, we envision a new approach we call stable multithreading (StableMT) that reuses each schedule on a wide range of inputs, mapping all inputs to a dramatically reduced set of schedules. By vastly shrinking the haystack, it makes the needles much easier to find. By mapping many inputs to the same schedule, it stabilizes program behaviors against small input perturbations.
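To give a rough sense of scale for the "enormously many schedules" in the excerpt above, here is a small sketch that counts interleavings under a deliberately simplified model. The model is an assumption of this example, not something taken from the article: each thread runs a fixed number of atomic steps, and no synchronization restricts the ordering. The function name `interleavings` is hypothetical.

```haskell
-- Count the distinct interleavings of independent threads, where each
-- list element is the number of atomic steps one thread executes.
-- Simplifying assumption: no locks or other synchronization constrain
-- the ordering, so the count is the multinomial coefficient
--   (sum steps)! / (s1! * s2! * ... * sn!)
interleavings :: [Integer] -> Integer
interleavings steps =
  factorial (sum steps) `div` product (map factorial steps)
  where
    factorial n = product [1 .. n]
```

Under this model, `interleavings [3, 3]` is 20 for two threads of three steps each, while `interleavings [5, 5, 5, 5]` is already 11732745024 for four threads of five steps each. Real programs have far more steps per thread, which is the haystack the authors describe.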
The link above is to a publicly available pre-print of the article that appeared in the most recent CACM. The CACM article is a summary of work by Junfeng Yang's research group. Additional papers related to this research can be found at http://www.cs.columbia.edu/~junfeng/
The wiki page on monad performance claims that mtl monad transformers are quite inefficient. Unfortunately, when I'm writing state-heavy Haskell code, I typically have a stack of Writer, State, Reader, ErrorT, etc. I'm trying to benchmark some relatively idiomatic Haskell vs a C++ implementation, and I want to write Haskell code which is as competitive as possible. However, the alternatives shown on that wiki page look awful to me. I'd rather take a performance hit than write code I won't understand in a few months, or that I have to overhaul every time I change my monad stack.
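For concreteness, a stack of the kind described might look like the sketch below. Everything in it is illustrative: the names `App`, `step`, and `runApp` are made up for this example, `ExceptT` (via `Except`) stands in for the older `ErrorT`, and the code assumes the `mtl` package is available. It makes no claim about performance; it only shows that code written against the mtl classes, as `step` is, does not need to change when the order of the layers in `App` changes.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad (when)
import Control.Monad.Except (Except, MonadError, runExcept, throwError)
import Control.Monad.Reader (MonadReader, ReaderT, ask, runReaderT)
import Control.Monad.State.Strict (MonadState, StateT, get, put, runStateT)
import Control.Monad.Writer.Strict (MonadWriter, WriterT, runWriterT, tell)

-- Hypothetical stack: read-only limit, counter state, log, and errors.
type App = ReaderT Int (StateT Int (WriterT [String] (Except String)))

-- Written against the mtl classes rather than the concrete stack, so it
-- keeps compiling if the layers of App are reordered or extended.
step :: (MonadReader Int m, MonadState Int m, MonadWriter [String] m, MonadError String m) => m ()
step = do
  limit <- ask
  n <- get
  when (n >= limit) $ throwError "limit reached"
  tell ["tick " ++ show n]
  put (n + 1)

-- Unwrap the layers from the outside in, starting the counter at 0.
runApp :: Int -> App a -> Either String ((a, Int), [String])
runApp limit app = runExcept (runWriterT (runStateT (runReaderT app limit) 0))
```

With these definitions, `runApp 2 (step >> step)` yields `Right (((), 2), ["tick 0", "tick 1"])`, and `runApp 1 (step >> step)` yields `Left "limit reached"`.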
Are these claims about monad transformer performance still valid? If so, are there any alternatives which still allow for relatively readable code? I'm not trying to match the performance of my C++ implementation, but I want it to at least be competitive. I'm already reading up on unboxing, strictness annotations, etc, it's just this higher-level stuff which has so far eluded me.
Thanks!

submitted by trolox
For some upcoming improvements to FP Haskell Center, I recently added a new feature to Stackage: the ability to detect module name conflicts. This is where two different packages both export a module of the same name.
You can see the full module name conflict list for my most recent build. The file format is fairly dumb: one line lists all of the packages using a common module name, and the following line contains all of the module names shared. (JSON, YAML, or CSV would have been better file formats for this, but one of the goals of the Stackage codebase is to avoid extra package dependencies wherever possible.)
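As a rough illustration of how little code a line-pair format like this needs, here is a sketch of a reader for it using only `base`. The post does not say how names are separated within a line, so this example assumes whitespace-separated names and ignores blank lines; the function name `parseConflicts` is hypothetical and is not part of Stackage.

```haskell
-- Parse the two-line-per-entry format into (packages, modules) pairs.
-- Assumptions (not confirmed by the post): names on a line are separated
-- by whitespace, and blank lines carry no meaning.
parseConflicts :: String -> [([String], [String])]
parseConflicts = go . filter (not . null) . lines
  where
    -- First line of a pair: package names; second line: module names.
    go (pkgs : mods : rest) = (words pkgs, words mods) : go rest
    -- A trailing unpaired line (or no lines at all) ends the parse.
    go _ = []
```

For example, `parseConflicts "hashmap unordered-containers\nData.HashSet\n"` gives `[(["hashmap", "unordered-containers"], ["Data.HashSet"])]`.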
Most of these conflicts don't seem problematic at all. The fact that base, haskell98, haskell2010, and base-compat share a lot of the same module names, for example, should be expected, and users really do need to choose just one of those packages to depend on.
Some other cases, on the other hand, might cause issues. For example, both hashmap and unordered-containers export the Data.HashSet module. This can negatively affect users of GHCi who have both packages installed and try to import Data.HashSet. Also, if for some reason a cabal package depended on both, you'd need to use package imports to disambiguate. There can also be an issue of confusion: if I see Data.HashSet at the top of a module, it would be nice to know which package it comes from without having to check a cabal file or run ghc-pkg.
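For readers who have not used package imports, disambiguation with GHC's `PackageImports` extension looks roughly like the sketch below. It assumes unordered-containers is installed and visible to GHC; the name `exampleSet` is made up for the example.

```haskell
{-# LANGUAGE PackageImports #-}

-- The string literal names the package the module should come from, so
-- this import is unambiguous even if hashmap is also installed.
-- (Writing "hashmap" instead would select that package's Data.HashSet.)
import qualified "unordered-containers" Data.HashSet as HS

-- Duplicates collapse, so this set has three elements.
exampleSet :: HS.HashSet Int
exampleSet = HS.fromList [1, 2, 3, 2]
```

The package qualifier also serves as documentation at the import site, which speaks to the confusion point above: the reader can see which package provides the module without consulting the cabal file.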
I'm mostly writing this blog post because I think it's the first time we've had any kind of collection of this information, and I don't think we've had a community discussion about conflicting module names. I don't know if the problem is significant enough to even warrant further analysis, nor do I have thoughts on how to proceed if we do want to try and disambiguate module names.
Here are some of the conflicting module names, and the packages using them: