News aggregator

FP Complete: Seeking a Haskell developer for high performance, distributed computing

Planet Haskell - Wed, 11/12/2014 - 6:00pm

You know Haskell is an outstanding language for parallel and concurrent programming, and you know how fast Haskell code can be. FP Complete, the leading Haskell company, is working on a powerful, scalable medical application. We need a senior Haskell developer to work on high performance computing and related projects.

You’ll create a general purpose set of libraries and tools for performing big distributed computations that run fast and are robust, and general-purpose tools to monitor such programs at runtime and both profile and debug them. Using your work we’ll perform very large volumes of computation per CPU, across multiple cores and across multiple servers.

This is a telecommute position which can be filled from anywhere, with some preference given to applicants in North America and especially the San Francisco and San Diego areas. You should have these skills:

  • overall strong Haskell application coding ability,
  • experience building reusable components/packages/libraries,
  • experience writing high-throughput computations in math, science, finance, engineering, big data, graphics, or other performance-intensive domains,
  • a solid understanding of distributed computing,
  • an understanding of how to achieve low-latency network communication,
  • knowledge of how to profile Haskell applications effectively,
  • work sharing or distributed computing algorithms,
  • multicore/parallel application development.

These further skills are a plus:

  • experience building scalable server/Web applications,
  • experience tuning GHC’s runtime options for high performance,
  • some experience working on GHC internals,
  • knowledge of Cloud Haskell,
  • knowledge of Threadscope or similar profiling tools,
  • an understanding of the implementation of GHC’s multithreaded runtime,
  • experience as a technical lead and/or a manager,
  • experience as an architect and/or a creator of technical specs.

In addition to these position-specific skills, candidates are expected to have clear written and verbal communication in English, the ability to work well within a distributed team and experience with typical project tools (Git or similar revision control system, issue tracker, etc).

Please submit your application to jobs+dev@fpcomplete.com. A real member of our engineering team will read it. In addition to a resume/CV, we very much like to see any open source work that you can point to in the relevant domains, or comparable source code you can show us.

Categories: Offsite Blogs

Proposal: Add 'fillBytes' to Foreign.Marshal.Utils

libraries list - Wed, 11/12/2014 - 4:21pm
Hi everyone,

Currently the memory allocated with (Foreign.Marshal.Alloc) malloc may potentially be dirty. In C this problem is usually solved by using memset. This would be extremely useful for FFI / C interop, when a data structure is allocated within Haskell code. With memset, you can do something like

customMem <- malloc
_ <- memset (castPtr customMem) 0 #{size custom_t}

This will fill a block of allocated memory with zeroes. For example, I've been working on FFI bindings for collectd, and when I was allocating memory, previously used within the process, it was dirty, so my CString contained a value of "custom name7e0700490, test_plugin_LTX_module_register): symbol nd" Instead of just "custom name" After using memset and filling everything with zeroes, problem never appeared again.

This can be implemented in user applications, too, although it would be really nice to have it by default. This is a common practice in C world, and this function is not only useful for cases when the memory was
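The zeroing step described above can be sketched without a hand-written memset binding. The sketch assumes a base library of version 4.8 or later, where Foreign.Marshal.Utils exports fillBytes :: Ptr a -> Word8 -> Int -> IO (). The helper name zeroedBytes is made up for illustration and is not part of the proposal.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Marshal.Utils (fillBytes)
import Foreign.Ptr (Ptr)

-- | Hypothetical helper: allocate n bytes, zero them, and read them back.
-- mallocBytes makes no promise about the initial contents of the buffer,
-- so it is cleared explicitly before anything reads from it.
zeroedBytes :: Int -> IO [Word8]
zeroedBytes n = do
  buf <- mallocBytes n :: IO (Ptr Word8)
  fillBytes buf 0 n            -- plays the role of memset(buf, 0, n) in C
  bytes <- peekArray n buf     -- copy the contents out before freeing
  free buf
  return bytes
```

In real bindings the buffer would be handed to C code rather than read back; reading it here only makes the effect visible.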
Categories: Offsite Discussion

Presenting at Royal Holloway Colloquium

haskell-cafe - Wed, 11/12/2014 - 2:41pm
All, I have been invited to give a TED style talk (20 mins) at the Royal Holloway Hewlett Packard Information Security Colloquium: https://www.royalholloway.ac.uk/isg/externalengagement/hpday.aspx. Now I could give an uncontroversial talk about Internet banking security using triple DES, role based access control, etc. but I am thinking about being controversial (I think that is in the spirit of TED). I’d like to say that the Information Security community is solving the wrong problems by e.g. performing security audits of code, developing tools for finding buffer overflows, etc. and what they should really be doing is encouraging development in languages that prevent this sort of behaviour. E.g. if openssl were written in Haskell, heartbleed (http://en.wikipedia.org/wiki/Heartbleed) would never have happened. What do people think about this? Are there other examples I can draw on? Dominic Steinitz dominic< at >steinitz.org http://idontgetoutmuch.wordpress.com
Categories: Offsite Discussion

data analysis question

haskell-cafe - Wed, 11/12/2014 - 11:45am
Hi, just the other day I talked to a friend of mine who works for an online radio service who told me he was currently looking into how best to work with assorted usage data: currently 250 million entries in a 12GB CSV file comprising information such as which channel was tuned in for how long with which user agent and what not. He accidentally ran into the K and Q programming languages [1][2], which apparently work nicely for this, as unfamiliar as it might seem. This certainly is not my area of expertise at all. I was just wondering how some of you would suggest approaching this with Haskell. How would you most efficiently parse such data and evaluate custom queries? Thanks for your time, Tobi [1] http://en.wikipedia.org/wiki/K_(programming_language) [2] http://en.wikipedia.org/wiki/Q_(programming_language_from_Kx_Systems)
Categories: Offsite Discussion

PROPOSAL: toBoundedIntegral: an adaptation of fromIntegral that respects bounds

libraries list - Wed, 11/12/2014 - 9:14am
Inspired by conversations recent [1] and not-so-recent [2] and by my own past wish for this, I propose adding the following function to base:

toBoundedIntegral :: (Integral a, Integral b, Bounded b) => a -> Maybe b
toBoundedIntegral x
  | y > toInteger (maxBound `asTypeOf` z) = Nothing
  | y < toInteger (minBound `asTypeOf` z) = Nothing
  | otherwise = Just $! z
  where
    y = toInteger x
    z = fromInteger y

This includes rules to optimize for many cases where we know the bounds at compile time. See the gist for the full implementation: https://gist.github.com/spl/1986c9ff0b2416948957

I'm not particularly concerned with the precise location of this function. It could start out in GHC.Int if there's controversy. Otherwise, Data.Int would be an option to avoid polluting the Prelude. I chose the name to be descriptive of the result type, which is the more constrained type. The function that I started from [2] was called fromIntegralSafe, but I think “safe” is too ambigu
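To see what the proposal changes in practice, the sketch below compares the checked conversion with plain fromIntegral. The definition is copied from the post so that the block compiles on its own; the optimisation rules from the gist are left out, and the name examples is only for illustration.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- Definition as given in the post (without the rewrite rules from the gist).
toBoundedIntegral :: (Integral a, Integral b, Bounded b) => a -> Maybe b
toBoundedIntegral x
  | y > toInteger (maxBound `asTypeOf` z) = Nothing
  | y < toInteger (minBound `asTypeOf` z) = Nothing
  | otherwise                             = Just $! z
  where
    y = toInteger x
    z = fromInteger y

-- Illustrative values: in range, above range, below range, and the
-- silent wrap-around that plain fromIntegral performs on the same input.
examples :: (Maybe Word8, Maybe Word8, Maybe Int8, Word8)
examples =
  ( toBoundedIntegral (200 :: Int)    -- Just 200
  , toBoundedIntegral (300 :: Int)    -- Nothing: 300 > 255
  , toBoundedIntegral (-129 :: Int)   -- Nothing: -129 < -128
  , fromIntegral (300 :: Int)         -- 44, i.e. 300 modulo 256
  )
```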
Categories: Offsite Discussion

www.quora.com

del.icio.us/haskell - Wed, 11/12/2014 - 8:48am
Categories: Offsite Blogs

Yesod Web Framework: Installing application dependencies using Stackage, sandboxes, and freezing

Planet Haskell - Wed, 11/12/2014 - 5:38am

Installing Haskell packages is still a pain. But I believe the community has some good enough workarounds that put Haskell on par with a lot of other programming languages. The problem is mostly that the tools and techniques are newer, do not always integrate easily, and are still lacking some automation.

My strategy for successful installation:

  • Install through Stackage
  • Use a sandbox when you start having complexities
  • Freeze (application) dependencies

Simple definitions:

  • Stackage is Stable Hackage: a curated list of packages that are guaranteed to work together
  • A sandbox is a project-local package installation
  • Freezing is specifying exact dependency versions.

I really hope that Stackage (and sandboxes to a certain extent) are temporary workarounds before we have an amazing installation system such as backpack. But right now, I think this is the best general-purpose solution we have. There are other tools that you can use if you are not on Windows:

  • hsenv (instead of sandboxes)
  • nix (instead of Stackage and sandboxes)

hsenv has been a great tool that I have used in the past, but I personally don't think that sandboxing at the shell level with hsenv is the best choice architecturally. I don't want to have a sandbox name on my command line to remind me that it is working correctly, I just want cabal to handle sandboxes automatically.

Using Stackage

See the Stackage documentation. You just need to change the remote-repo setting in your ~/.cabal/config file.

Stackage is a curated list of packages that are guaranteed to work together. Stackage solves dependency hell with exclusive and inclusive package snapshots, but it cannot be used on every project.

Stackage offers two package lists: exclusive and inclusive. Exclusive includes only packages vetted by Stackage. Exclusive will always work, even for global installations. This has the nice effect of speeding up installation and keeping your disk usage low, whereas if you default to using sandboxes and you are making minor fixes to libraries you can end up with huge disk usage. However, you may eventually need packages not on Stackage, at which point you will need to use the inclusive snapshot. At some point you will be dealing with conflicts between projects, and then you definitely need to start using sandboxes. The biggest problem with Stackage is that you may need a newer version of a package than what is on the exclusive list. At that point you definitely need to stop using Stackage and start using a sandbox.

If you think a project has complex dependencies, which probably includes most applications in a team work setting, you will probably want to start with a sandbox.

Sandboxes

cabal sandbox init

A sandbox is a project-local package installation. It solves the problem of installation conflicts with other projects (either actively overwriting each other or passively causing install problems). However, the biggest problem with sandboxes is that unlike Stackage exclusive, you still have no guarantee that cabal will be able to figure out how to install your dependencies.

Sandboxes are mostly orthogonal to Stackage. If you can use Stackage exclusive, you should, and if you never did a cabal update, you would have no need for a sandbox with Stackage exclusive. When I am making minor library patches, I try to just use my global package database with Stackage to avoid bloating disk usage from redundant installs.

So even with Stackage we are going to end up wanting to create sandboxes. But we would still like to use Stackage in our sandbox: this will give us the highest probability of a successful install. Unfortunately, Stackage (remote-repo) integration does not work for a sandbox.

The good news is that there is a patch for Cabal that has already been merged (but not yet released). Even better news is that you can use Stackage with a sandbox today! Cabal recognizes a cabal.config file which specifies a list of constraints that must be met, and we can set that to use Stackage.

cabal sandbox init
curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config
cabal install --only-dep

Freezing

There is a problem with our wonderful setup: what happens when our package is installed on another location? If we are developing a library, we need to figure out how to make it work everywhere, so this is not as much of an issue.

Application builders, on the other hand, need to produce reliable, reproducible builds to guarantee correct application behavior. Haskellers have attempted to do this in the .cabal file by pegging versions. But .cabal file versioning is meant for library authors to specify maximum version ranges that a library author hopes will work with their package. Pegging packages to specific versions in a .cabal file will eventually fail because there are dependencies of dependencies that are not listed in the .cabal file and thus not pegged. The previous section's usage of a cabal.config has a similar issue since only packages from Stackage are pegged, but Hackage packages are not.

The solution to this is to freeze your dependencies:

cabal freeze

This writes out a new cabal.config (overwriting any existing cabal.config). Checking in this cabal.config file guarantees that everyone on your team will be able to reproduce the exact same build of Haskell dependencies. That gets us into upgrade issues, which are discussed below.

It is also worth noting that there is still a rare situation in which freezing won't work properly because packages can be edited on Hackage.

Installation workflow

Let's go over an installation workflow:

cabal sandbox init
curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config
cabal install --only-dep

An application developer will then want to freeze their dependencies.

cabal freeze
git add cabal.config
git commit cabal.config

Upgrading packages

cabal-install should provide us with a cabal upgrade [PACKAGE-VERSION] command. That would perform an upgrade of the package to the version specified, but also perform a conservative upgrade of any transitive dependencies of that package. Unfortunately, we have to do upgrades manually.

One option for upgrading is to just wipe out your cabal.config and do a fresh re-install.

rm cabal.config
rm -r .cabal-sandbox
cabal sandbox init
curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config
cabal update
cabal install --only-dep
cabal freeze

With this approach all your dependencies can change so you need to re-test your entire application. So to make this more efficient you are probably going to want to think about upgrading more dependencies than what you originally had in mind to avoid doing this process again a week from now.

The other extreme is to become the solver. Manually tinker with the cabal.config until you figure out the upgrade plan that cabal install --only-dep will accept. In between, you can attempt to leverage the fact that cabal already tries to perform conservative upgrades once you have packages installed.

rm cabal.config
curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config
cabal update
cabal install --only-dep --force-reinstalls
cabal freeze

You can make a first attempt without the --force-reinstalls flag, but the flag is likely to be necessary.

If you can no longer use Stackage because you need newer versions of the exclusive packages, then your workflow will be the same as above without the curl step. But you will have a greater desire to manually tinker with the cabal.config file. This process usually consists mostly of deleting constraints or changing them to be a lower bound.
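The manual tinkering described above, changing an exact pin into a lower bound, can be sketched as a small text transformation. This is an illustration only: it assumes cabal.config has the shape that cabal freeze writes, a constraints: field with one "name ==version" entry per line, and the function name relaxPin is made up.

```haskell
import Data.List (isPrefixOf)

-- | Hypothetical helper: rewrite one line of a frozen cabal.config.
-- If the line pins the named package with "==", the pin becomes a
-- ">=" lower bound; every other line is returned unchanged.
relaxPin :: String -> String -> String
relaxPin pkg line =
  case words body of
    [name, ver]
      | name == pkg, "==" `isPrefixOf` ver ->
          indent ++ prefix ++ name ++ " >=" ++ drop 2 ver
    _ -> line
  where
    -- Keep leading indentation so continuation lines stay aligned.
    (indent, rest) = span (== ' ') line
    -- The first entry shares its line with the field name.
    (prefix, body)
      | "constraints:" `isPrefixOf` rest = ("constraints: ", drop 12 rest)
      | otherwise                        = ("", rest)
```

Applied with map (relaxPin "aeson") . lines to the file contents, this would loosen only the aeson entry and leave the rest of the frozen plan intact, which keeps the search space for cabal small.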

Conclusion

Upgrading packages is still a horrible experience.

However, for a fresh install, using Stackage, sandboxes, and freezing works amazingly well. Of course, once you are unable to use Stackage because you need different exclusive versions you will encounter installation troubles. But if you originally started based off of Stackage and try to perform conservative upgrades, you may still find your situation easier to navigate because you have already greatly reduced the search space for cabal. And if you are freezing versions and checking in the cabal.config, the great thing is that you can experiment with installing new dependencies but can always revert back to the last known working dependencies.

Using these techniques I am able to get cabal to reliably install complex dependency trees with very few issues and to get consistent application builds.

Categories: Offsite Blogs

Yesod Web Framework: The case for curation: Stackage and the PVP

Planet Haskell - Wed, 11/12/2014 - 5:38am

A number of months back there was a long series of discussions around the Package Versioning Policy (PVP), and in particular the policy of putting in preemptive upper bounds (that is to say, placing upper bounds before it is proven that they are necessary). Eventually, the conversation died down, and I left some points unsaid in the interests of letting that conversation die. Now that Neil Mitchell kicked up the dust again, I may as well get a few ideas out there.

tl;dr Stackage is simply better tooling, and we should be using better tooling instead of arguing policy. The PVP upper bounds advocates are arguing for a world without sin, and such a world doesn't exist. Get started with Stackage now.

This blog post will be a bit unusual. Since I'm so used to seeing questions, criticisms, and misinformation on this topic, I'm going to interject commonly stated memes throughout this blog post and answer them directly. Hopefully this doesn't cause too much confusion.

As most people reading this are probably aware, I manage the Stackage project. I have a firm belief that the PVP discussions we've been having are, essentially, meaningless for the general use case, and simply improving our tool chain is the right answer. Stackage is one crucial component of that improvement.

But Stackage doesn't really help end users, it's nothing more than a CI system for Hackage. The initial Stackage work may have counted as that, but Stackage was never intended to just be behind the scenes. Stackage server provides a very user-friendly solution for solving packaging problems. While I hope to continue improving the ecosystem- together with the Haskell Platform and Cabal maintainers- Stackage server is already a huge leap forward for most users today. (See also: GPS Haskell.)

The PVP is composed of multiple ideas. I'd like to break it into:

  1. A method for versioning packages based on API changes.
  2. How lower bounds should be set on dependencies.
  3. How upper bounds should be set on dependencies.

Just about everyone I've spoken to agrees with the PVP on (1) and (2); the only question comes up with point (3). The arguments go like this: preemptive upper bounds add a lot of maintainer overhead by requiring them to upload new versions of packages to relax version bounds regularly. (This is somewhat mitigated by the new cabal file editing feature of Hackage, but that has its own problems.) On the other hand, to quote some people on Reddit:

I'd rather make a release that relaxes bounds rather than have EVERY previous version suddenly become unusable for folks

that upper bounds should not be viewed as handcuffs, but rather as useful information about the range of dependencies that is known to work. This information makes the solver's job easier. If you don't provide them, your packages are guaranteed to break as t -> ∞.

These statements are simply false. I can guarantee you with absolute certainty that, regardless of the presence of upper bounds, I will be able to continue to build software written against yesod 1.4 (or any other library/version I'm using today) indefinitely. I may have to use the same compiler version and fiddle with shared libraries a bit if I update my OS. But this notion that packages magically break is simply false.

But I have some code that built two months ago, and I opened it today and it doesn't work! I didn't say that the standard Haskell toolchain supports this correctly. I'm saying that the absence of upper bounds doesn't guarantee that a problem will exist.

Without dancing around the issue any further, let me cut to the heart of the problem: our toolchain makes it the job of every end user to find a consistent build plan. Finding such a build plan is inherently a hard problem, so why are we pushing the work downstream? Furthermore, it's terrible practice for working with teams. The entire team should be working in the same package environment, not each working on "whatever cabal-install decided it should try to build today."

There's a well known, time-tested solution to this problem: curation. It's simple: we have a central person/team/organization that figures out consistent sets of packages, and then provides them to downstream users. Downstream users then never have to deal with battling against large sets of dependencies.

But isn't curation a difficult, time-consuming process? How can the Haskell community support that? Firstly, that's not really an important question, since the curation is already happening. Even if it took a full-time staff of 10 people working around the clock, if the work is already done, it's done. In practice, now that the Stackage infrastructure is in place, curation probably averages out to 2 hours of my time a week, unless Edward Kmett decides to release a new version of one of his packages.

This constant arguing around PVP upper bounds truly baffles me, because every discussion I've seen of it seems to completely disregard the fact that there's an improved toolchain around for which all of the PVP upper bound arguments are simply null and void. And let me clarify that statement: I'm not saying Stackage answers the PVP upper bound question. I'm saying that- for the vast majority of users- Stackage makes the answer to the question irrelevant. If you are using Stackage, it makes not one bit of difference to you whether a package has upper bounds or not.

And for the record, Stackage isn't the only solution to the problem that makes the PVP upper bound question irrelevant. Having cabal-install automatically determine upper bounds based on upload dates is entirely possible. I in fact already implemented such a system, and sent it for review to two of the staunchest PVP-upper-bounds advocates I interact with. I didn't actually receive any concrete feedback.
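The post gives no details of that system, so the following is only one possible reading of "upper bounds based on upload dates": for each dependency, accept nothing newer than the latest version that already existed when the depending package was uploaded. All types and names here are hypothetical and are not taken from cabal-install or from the author's implementation.

```haskell
-- Hypothetical, simplified types for illustration.
type Version = [Int]            -- e.g. [1,4,2] for version 1.4.2
type Date    = (Int, Int, Int)  -- (year, month, day), compared lexicographically

-- | Given the upload date of the depending package and the upload
-- history of one dependency, return the newest version of that
-- dependency that existed on or before that date. This version
-- would act as the implied upper bound.
latestAsOf :: Date -> [(Version, Date)] -> Maybe Version
latestAsOf cutoff uploads =
  case [v | (v, d) <- uploads, d <= cutoff] of
    [] -> Nothing               -- the dependency did not exist yet
    vs -> Just (maximum vs)
```

The appeal of such a scheme is that no maintainer has to write the bound by hand: it falls out of data the package index already records.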

So that brings me back to my point: why are we constantly arguing about this issue which clearly has good arguments on both sides, when we could instead just upgrade our tooling and do away with the problem?

But surely upper bounds do affect some users, right? For one, it affects the people doing curation itself (that's me). I can tell you without any doubt that PVP upper bounds makes my life more difficult during curation. I've figured out ways to work around it, so I don't feel like trying to convince people to change their opinions. It also affects people who aren't using Stackage or some other improved tooling. And my question to those people is: why not?

I'd like to close by addressing the idea that the PVP is "a solution." Obviously that's a vague statement, because we have to define "the problem." So I'll define the problem as: someone types cabal install foo and it doesn't install. Let me count the ways that PVP upper bounds fail to completely solve this problem:

  1. The newest release of foo may have a bug in it, and cabal has no way of knowing it.
  2. One of the dependencies of foo may have a bug in it, and for whatever reason cabal chooses that version.
  3. foo doesn't include PVP upper bounds and a new version of a dependency breaks it. (See comment below if you don't like this point.)
  4. Some of the dependencies of foo don't include PVP upper bounds, and a new version of the transitive dependencies break things.
  5. There's a semantic change in a point release which causes tests to fail. (You do test your environments before using them, right? Because Stackage does.)
  6. I've been really good and included PVP upper bounds, and only depended on packages that include PVP upper bounds. But I slipped up and had a mistake in a cabal file once. Now cabal chooses an invalid build plan.
  7. All of the truly legitimate reasons why the build may fail: no version of the package was ever buildable, no version of the package was ever compatible with your version of GHC or OS, it requires something installed system wide that you don't have, etc.

That's not fair, point (3) says that the policy doesn't help if you don't follow it, that's a catch 22! Nope, that's exactly my point. A policy on its own does not enforce anything. A tooling solution can enforce invariants. Claiming that the PVP will simply solve dependency problems is built around the idea of universal compliance, lack of mistakes, historical compliance, and the PVP itself covering all possible build issues. None of these claims hold up in the real world.

To go comically over the top: assuming the PVP will solve dependency problems is hoping to live in a world without sin. We must accept the PVP into our hearts. If we have a build problem, we must have faith that it is because we did not trust the PVP truly enough. The sin is not with the cabal dependency solver, it is with ourselves. If we ever strayed from the path of the PVP, we must repent of our evil ways, and return unto the PVP, for the PVP is good. I'm a religious man. My religion just happens to not be the PVP.

I'm not claiming that Stackage solves every single reason why a build fails. The points under (7), for example, are not addressed. However, many of the common problems people face- and, I'd argue, the vast majority of issues that confuse and plague users- are addressed by simply moving over to Stackage.

If you haven't already, I highly recommend you give Stackage a try today.

Categories: Offsite Blogs

Using multiple versions of the Haskell Platform on Windows

haskell-cafe - Wed, 11/12/2014 - 4:43am
The win-hp-path project provides the use-hp command, which makes it easy to switch between different versions of Haskell Platform on Windows. https://github.com/ygale/win-hp-path We are using it for running cabal and GHC in command prompt windows. In particular, we can also use it in build scripts used by the build management team who are not Haskell programmers. Please let me know if you can use this in more complex dev environments, or if you have suggestions about how it could be enhanced to do that. Pull requests are welcome. Thanks, Yitz
Categories: Offsite Discussion

Final bikeshedding call: Fixing Control.Exception.bracket

libraries list - Tue, 11/11/2014 - 8:09pm
Ola!

In September Eyal Lotem raised the issue of bracket's cleanup handler not being uninterruptible [1]. This is a final bikeshedding email before I submit a patch.

The problem, summarised: Blocking cleanup actions can be interrupted, causing cleanup not to happen and potentially leaking resources.

Main objection to making the cleanup handler uninterruptible: Could cause deadlock if the code relies on async exceptions to interrupt a blocked thread.

I count only two objections in the previous thread, 1 on the grounds that "deadlocks are NOT unlikely" and 1 that is conditioned on "I don't believe this is a problem". The rest seems either +1, or at least agrees that the status quo is *worse* than the proposed solution.

My counter to these objections is:

1) No one has yet shown me any code that relies on the cleanup handler being interruptible
2) There are plenty of examples of current code being broken, for example every single 'bracket' using file handles is broken due to handle operations using a pote
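The behaviour under discussion can be sketched as a bracket variant whose cleanup runs under uninterruptibleMask_. This is not the patch from the thread, just a minimal illustration built from functions that Control.Exception already exports; the names bracketUninterruptible and cleanupRuns are made up.

```haskell
import Control.Exception (mask, onException, uninterruptibleMask_)
import Data.IORef (newIORef, readIORef, writeIORef)

-- | Like bracket, but the release action cannot be interrupted by an
-- asynchronous exception, even while it is blocked.
bracketUninterruptible :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
bracketUninterruptible acquire release use =
  mask $ \restore -> do
    resource <- acquire
    -- Run the body with exceptions restored; on failure, clean up
    -- uninterruptibly and let the exception propagate.
    result <- restore (use resource)
                `onException` uninterruptibleMask_ (release resource)
    -- Normal exit: clean up uninterruptibly as well.
    _ <- uninterruptibleMask_ (release resource)
    return result

-- | Small check that the release action runs on the normal path.
cleanupRuns :: IO Bool
cleanupRuns = do
  flag <- newIORef False
  _ <- bracketUninterruptible
         (return ())
         (\_ -> writeIORef flag True)
         (\_ -> return (42 :: Int))
  readIORef flag
```

The trade-off is the one named in the email: a release action that blocks forever can no longer be killed from outside, so it must be written to terminate.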
Categories: Offsite Discussion

Monad m => m (Maybe a) -> m (Maybe a) -> m (Maybe a)

haskell-cafe - Tue, 11/11/2014 - 6:57pm
I've been using these functions lately:

try :: Monad m => m (Maybe a) -> m (Maybe a) -> m (Maybe a)
try action alternative = maybe alternative (return . Just) =<< action

tries :: Monad m => [m (Maybe a)] -> m (Maybe a)
tries = foldr try (return Nothing)

It's sort of like (<|>) on Maybe, or MonadPlus, but within a monad. It seems like the sort of thing that should be already available, but hoogle shows nothing. I think 'm' has to be a monad, and I can't figure out how to generalize the Maybe to MonadPlus or Alternative.

It's sort of a mirror image to another function I use a lot:

justm :: Monad m => m (Maybe a) -> (a -> m (Maybe b)) -> m (Maybe b)
justm op1 op2 = maybe (return Nothing) op2 =<< op1

... which is just MaybeT for when I can't be bothered to put runMaybeT and lifts and hoists on everything. So you could say 'try' is like MaybeT with the exceptional case reversed.

Is 'try' just the instantiation of some standard typeclass, or is it its own thing?
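One way to look at the closing question: try behaves like (<|>) from the Alternative instance of MaybeT in the transformers package, with the wrapping and unwrapping done by hand. The sketch below repeats the two definitions from the post and places the MaybeT versions next to them. The names tryT and triesT are made up, and the Monad m constraint alone assumes GHC 7.10 or later, where Functor is a superclass of Monad.

```haskell
import Control.Applicative ((<|>))
import Control.Monad.Trans.Maybe (MaybeT (..))
import Data.Foldable (asum)

-- Definitions from the post.
try :: Monad m => m (Maybe a) -> m (Maybe a) -> m (Maybe a)
try action alternative = maybe alternative (return . Just) =<< action

tries :: Monad m => [m (Maybe a)] -> m (Maybe a)
tries = foldr try (return Nothing)

-- The same behaviour via MaybeT: (<|>) runs the second action only
-- when the first one produced Nothing.
tryT :: Monad m => m (Maybe a) -> m (Maybe a) -> m (Maybe a)
tryT action alternative = runMaybeT (MaybeT action <|> MaybeT alternative)

-- asum folds a list with (<|>), starting from empty (which is Nothing).
triesT :: Monad m => [m (Maybe a)] -> m (Maybe a)
triesT = runMaybeT . asum . map MaybeT
```

Seen this way, justm corresponds to (>>=) of MaybeT and try to (<|>), so both are the same transformer used through different instances.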
Categories: Offsite Discussion

github.com

del.icio.us/haskell - Tue, 11/11/2014 - 6:31pm
Categories: Offsite Blogs