You know Haskell is an outstanding language for parallel and concurrent programming, and you know how fast Haskell code can be. FP Complete, the leading Haskell company, is working on a powerful, scalable medical application. We need a senior Haskell developer to work on high performance computing and related projects.
You’ll create a general purpose set of libraries and tools for performing big distributed computations that run fast and are robust, and general-purpose tools to monitor such programs at runtime and both profile and debug them. Using your work we’ll perform very large volumes of computation per CPU, across multiple cores and across multiple servers.
This is a telecommute position which can be filled from anywhere, with some preference given to applicants in North America and especially the San Francisco and San Diego areas. You should have these skills:
- overall strong Haskell application coding ability
- experience building reusable components/packages/libraries,
- experience writing high-throughput computations in math, science, finance, engineering, big data, graphics, or other performance-intensive domains,
- a solid understanding of distributed computing,
- an understanding of how to achieve low-latency network communication,
- knowledge of how to profile Haskell applications effectively,
- work sharing or distributed computing algorithms,
- multicore/parallel application development.
These further skills are a plus:
- experience building scalable server/Web applications,
- experience tuning GHC’s runtime options for high performance,
- some experience working on GHC internals,
- knowledge of Cloud Haskell,
- knowledge of Threadscope or similar profiling tools,
- an understanding of the implementation of GHC’s multithreaded runtime,
- experience as a technical lead and/or a manager,
- experience as an architect and/or a creator of technical specs.
In addition to these position-specific skills, candidates are expected to have clear written and verbal communication in English, the ability to work well within a distributed team and experience with typical project tools (Git or similar revision control system, issue tracker, etc).
Please submit your application to email@example.com. A real member of our engineering team will read it. In addition to a resume/CV, we very much like to see any open source work that you can point to in the relevant domains, or comparable source code you can show us.
Installing Haskell packages is still a pain. But I believe the community has some good enough workarounds that puts Haskell on par with a lot of other programming languages. The problem is mostly that the tools and techniques are newer, do not always integrate easily, and are still lacking some automation.
My strategy for successful installation:
- Install through Stackage
- Use a sandbox when you start having complexities
- freeze (application) dependencies
- Stackage is Stable Hackage: a curated list of packages that are guaranteed to work together
- A sandbox is a project-local package installation
- Freezing is specifying exact dependency versions.
I really hope that Stackage (and sandboxes to a certain extent) are temporary workarounds before we have an amazing installation system such as backpack. But right now, I think this is the best general-purpose solution we have. There are other tools that you can use if you are not on Windows:
- hsenv (instead of sandboxes)
- nix (instead of Stackage and sandboxes)
hsenv has been a great tool that I have used in the past, but I personally don't think that sandboxing at the shell level with hsenv is the best choice architecturally. I don't want to have a sandbox name on my command line to remind me that it is working correctly, I just want cabal to handle sandboxes automatically.Using Stackage
See the Stackage documentation. You just need to change the remote-repo setting in your ~/.cabal/config file.
Stackage is a curated list of packages that are guaranteed to work together. Stackage solves dependency hell with exclusive and inclusive package snapshots, but it cannot be used on every project.
Stackage offers 2 package lists: exclusive, and inclusive. Exclusive includes only packages vetted by Stackage. Exclusive will always work, even for global installations. This has the nice effect of speeding up installation and keeping your disk usage low, whereas if you default to using sandboxes and you are making minor fixes to libraries you can end up with huge disk usage. However, you may eventually need packages not on Stackage, at which point you will need to use the inclusive snapshot. At some point you will be dealing with conflicts between projects, and then you definitely need to start using sandboxes. The biggest problem with Stackage is that you may need a newer version of a package than what is on the exclusive list. At that point you definitely need to stop using Stackage and start using a sandbox.
If you think a project has complex dependencies, which probably includes most applications in a team work setting, you will probably want to start with a sandbox.Sandboxescabal sandbox init
A sandbox is a project-local package installation. It solves the problem of installation conflicts with other projects (either actively over-writing each-other or passively sabotaging install problems). However, the biggest problem with sandboxes is that unlike Stackage exclusive, you still have no guarantee that cabal will be able to figure out how to install your dependencies.
sandboxes are mostly orthogonal to Stackage. If you can use Stackage exclusive, you should, and if you never did a cabal update, you would have no need for a sandbox with Stackage exclusive. When I am making minor library patches, I try to just use my global package database with Stackage to avoid bloating disk usage from redundant installs.
So even with Stackage we are going to end up wanting to create sandboxes. But we would still like to use Stackage in our sandbox: this will give us the highest probability of a successful install. Unfortunately, Stackage (remote-repo) integration does not work for a sandbox.
The good news is that there is a patch for Cabal that has already been merged (but not yet released). Even better news is that you can use Stackage with a sandbox today! Cabal recognizes a cabal.config file which specifies a list of constraints that must be met, and we can set that to use Stackage.cabal sandbox init curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config cabal install --only-depFreezing
There is a problem with our wonderful setup: what happens when our package is installed on another location? If we are developing a library, we need to figure out how to make it work everywhere, so this is not as much of an issue.
Application builders on the other hand need to produce reliable, re-producible builds to guarantee correct application behavior. Haskellers have attempted to do this in the .cabal file by pegging versions. But .cabal file versioning is meant for library authors to specify maximum version ranges that a library author hopes will work with their package. Pegging packages to specific versions in a .cabal file will eventually fail because there are dependencies of dependencies that are not listed in the .cabal file and thus not pegged. The previous section's usage of a cabal.config has a similar issue since only packages from Stackage are pegged, but Hackage packages are not.
The solution to this is to freeze your dependencies:cabal freeze
This writes out a new cabal.config (overwriting any existing cabal.config). Checking in this cabal.config file guarantees that everyone on your team will be able to reproduce the exact same build of Haskell dependencies. That gets us into upgrade issues that will be discussed.
It is also worth noting that there is still a rare situation in which freezing won't work properly because packages can be edited on Hackage.Installation workflow
Lets go over an installation workflow:cabal sandbox init curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config cabal install --only-dep
An application developer will then want to freeze their dependencies.cabal freeze git add cabal.config git commit cabal.configUpgrading packages
cabal-install should provide us with a cabal upgrade [PACKAGE-VERSION] command. That would perform an upgrade of the package to the version specified, but also perform a conservative upgrade of any transitive dependencies of that package. Unfortunately, we have to do upgrades manually.
One option for upgrading is to just wipe out your cabal.config and do a fresh re-install.rm cabal.config rm -r .cabal-sandbox cabal sandbox init curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config cabal update cabal install --only-dep cabal freeze
With this approach all your dependencies can change so you need to re-test your entire application. So to make this more efficient you are probably going to want to think about upgrading more dependencies than what you originally had in mind to avoid doing this process again a week from now.
The other extreme is to become the solver. Manually tinker with the cabal.config until you figure out the upgrade plan that cabal install --only-dep will accept. In between, you can attempt to leverage the fact that cabal already tries to perform conservative upgrades once you have packages installed.rm cabal.config curl http://www.stackage.org/alias/fpcomplete/unstable-ghc78-exclusive/cabal.config > cabal.config cabal update cabal install --only-dep --force-reinstalls cabal freeze
You can make a first attempt without the --force-reinstalls flag, but the flag is likely to be necessary.
If you can no longer use Stackage because you need newer versions of the exclusive packages, then your workflow will be the same as above without the curl step. But you will have a greater desire to manually tinker with the cabal.config file. This process usually consists mostly of deleting constraints or changing them to be a lower bound.Conclusion
Upgrading packages is still a horrible experience.
However, for a fresh install, using Stackage, sandboxes, and freezing works amazingly well. Of course, once you are unable to use Stackage because you need different exclusive versions you will encounter installation troubles. But if you originally started based off of Stackage and try to perform conservative upgrades, you may still find your situation easier to navigate because you have already greatly reduced the search space for cabal. And if you are freezing versions and checking in the cabal.config, the great thing is that you can experiment with installing new dependencies but can always revert back to the last known working dependencies.
Using these techniques I am able to get cabal to reliably install complex dependency trees with very few issues and to get consistent application builds.
A number of months back there was a long series of discussions around the Package Versioning Policy (PVP), and in particular the policy of putting in preemptive upper bounds (that is to say, placing upper bounds before it is proven that they are necessary). Eventually, the conversation died down, and I left some points unsaid in the interests of letting that conversation die. Now that Neil Mitchell kicked up the dust again, I may as well get a few ideas out there.
tl;dr Stackage is simply better tooling, and we should be using better tooling instead of arguing policy. The PVP upper bounds advocates are arguing for a world without sin, and such a world doesn't exist. Get started with Stackage now.
This blog post will be a bit unusual. Since I'm so used to seeing questions, criticisms, and misinformation on this topic, I'm going to interject commonly stated memes throughout this blog post and answer them directly. Hopefully this doesn't cause too much confusion.
As most people reading this are probably aware, I manage the Stackage project. I have a firm belief that the PVP discussions we've been having are, essentially, meaningless for the general use case, and simply improving our tool chain is the right answer. Stackage is one crucial component of that improvement.
But Stackage doesn't really help end users, it's nothing more than a CI system for Hackage. The initial Stackage work may have counted as that, but Stackage was never intended to just be behind the scenes. Stackage server provides a very user-friendly solution for solving packaging problems. While I hope to continue improving the ecosystem- together with the Haskell Platform and Cabal maintainers- Stackage server is already a huge leap forward for most users today. (See also: GPS Haskell.)
The PVP is composed of multiple ideas. I'd like to break it into:
- A method for versioning packages based on API changes.
- How lower bounds should be set on dependencies.
- How upper bounds should be set on dependencies.
Just about everyone I've spoken to agrees with the PVP on (1) and (2), the only question comes up with point (3). The arguments go like this: preemptive upper bounds add a lot of maintainer overhead by requiring them to upload new versions of packages to relax version bounds regularly. (This is somewhat mitigated by the new cabal file editing feature of Hackage, but that has its own problems.) On the other hand, to quote some people on Reddit:
I'd rather make a release that relaxes bounds rather than have EVERY previous version suddenly become unusable for folks
that upper bounds should not be viewed as handcuffs, but rather as useful information about the range of dependencies that is known to work. This information makes the solver's job easier. If you don't provide them, your packages are guaranteed to break as t -> ∞.
These statements are simply false. I can guarantee you with absolute certainty that, regardless of the presence of upper bounds, I will be able to continue to build software written against yesod 1.4 (or any other library/version I'm using today) indefinitely. I may have to use the same compiler version and fiddle with shared libraries a bit if I update my OS. But this notion that packages magically break is simply false.
But I have some code that built two months ago, and I opened it today and it doesn't work! I didn't say that the standard Haskell toolchain supports this correctly. I'm saying that the absence of upper bounds doesn't guarantee that a problem will exist.
Without dancing around the issue any further, let me cut to the heart of the problem: our toolchain makes it the job of every end user to find a consistent build plan. Finding such a build plan is inherently a hard problem, so why are we pushing the work downstream? Furthermore, it's terrible practice for working with teams. The entire team should be working in the same package environment, not each working on "whatever cabal-install decided it should try to build today."
There's a well known, time-tested solution to this problem: curation. It's simple: we have a central person/team/organization that figures out consistent sets of packages, and then provides them to downstream users. Downstream users then never have to deal with battling against large sets of dependencies.
But isn't curation a difficult, time-consuming process? How can the Haskell community support that? Firstly, that's not really an important question, since the curation is already happening. Even if it took a full-time staff of 10 people working around the clock, if the work is already done, it's done. In practice, now that the Stackage infrastructure is in place, curation probably averages out to 2 hours of my time a week, unless Edward Kmett decides to release a new version of one of his packages.
This constant arguing around PVP upper bounds truly baffles me, because every discussion I've seen of it seems to completely disregard the fact that there's an improved toolchain around for which all of the PVP upper bound arguments are simply null and void. And let me clarify that statement: I'm not saying Stackage answers the PVP upper bound question. I'm saying that- for the vast majority of users- Stackage makes the answer to the question irrelevant. If you are using Stackage, it makes not one bit of difference to you whether a package has upper bounds or not.
And for the record, Stackage isn't the only solution to the problem that makes the PVP upper bound question irrelevant. Having cabal-install automatically determine upper bounds based on upload dates is entirely possible. I in fact already implemented such a system, and sent it for review to two of the staunchest PVP-upper-bounds advocates I interact with. I didn't actually receive any concrete feedback.
So that brings me back to my point: why are we constantly arguing about this issue which clearly has good arguments on both sides, when we could instead just upgrade our tooling and do away with the problem?
But surely upper bounds do affect some users, right? For one, it affects the people doing curation itself (that's me). I can tell you without any doubt that PVP upper bounds makes my life more difficult during curation. I've figured out ways to work around it, so I don't feel like trying to convince people to change their opinions. It also affects people who aren't using Stackage or some other improved tooling. And my question to those people is: why not?
I'd like to close by addressing the idea that the PVP is "a solution." Obviously that's a vague statement, because we have to define "the problem." So I'll define the problem as: someone types cabal install foo and it doesn't install. Let me count the ways that PVP upper bounds fail to completely solve this problem:
- The newest release of foo may have a bug in it, and cabal has no way of knowing it.
- One of the dependencies of foo may have a bug in it, and for whatever reason cabal chooses that version.
- foo doesn't include PVP upper bounds and a new version of a dependency breaks it. (See comment below if you don't like this point.)
- Some of the dependencies of foo don't include PVP upper bounds, and a new version of the transitive dependencies break things.
- There's a semantic change in a point release which causes tests to fail. (You do test your environments before using them, right? Because Stackage does.)
- I've been really good and included PVP upper bounds, and only depended on packages that include PVP upper bounds. But I slipped up and had a mistake in a cabal file once. Now cabal chooses an invalid build plan.
- All of the truly legitimate reasons why the build may fail: no version of the package was ever buildable, no version of the package was ever compatible with your version of GHC or OS, it requires something installed system wide that you don't have, etc.
That's not fair, point (3) says that the policy doesn't help if you don't follow it, that's a catch 22! Nope, that's exactly my point. A policy on its own does not enforce anything. A tooling solution can enforce invariants. Claiming that the PVP will simply solve dependency problems is built around the idea of universal compliance, lack of mistakes, historical compliance, and the PVP itself covering all possible build issues. None of these claims hold up in the real world.
To go comically over the top: assuming the PVP will solve dependency problems is hoping to live in a world without sin. We must accept the PVP into our hearts. If we have a build problem, we must have faith that it is because we did not trust the PVP truly enough. The sin is not with the cabal dependency solver, it is with ourselves. If we ever strayed from the path of the PVP, we must repent of our evil ways, and return unto the PVP, for the PVP is good. I'm a religious man. My religion just happens to not be the PVP.
I'm not claiming that Stackage solves every single reason why a build fails. The points under (7), for example, are not addressed. However, maybe of the common problems people face- and, I'd argue, the vast majority of issues that confuse and plague users- are addressed by simply moving over to Stackage.
If you haven't already, I highly recommend you give Stackage a try today.