News aggregator

Well-Typed.Com: How we might abolish Cabal Hell, part 2

Planet Haskell - Wed, 01/28/2015 - 7:29am

In part 1 we looked at an overview of all the various problems that make up “Cabal hell”. We also looked at an overview of a few solutions and how they overlap.

In part 2 and part 3 we’ll look in more detail at the two major solutions to Cabal hell. In this post we’ll look at Nix-style package management, and in the next post we’ll look at curated package collections.

A reminder about what the problems are

You might recall from part 1 this slightly weird diagram of the symptoms of Cabal Hell:

[Diagram: the symptoms of Cabal Hell]

In this part we’re going to pick out a couple of the problems and look at them in a bit more detail:

  • Breaking re-installations
  • Type errors when using packages together
Breaking re-installations

There are situations where Cabal’s chosen solution would involve reinstalling an existing version of a package but built with different dependencies.

For example, suppose I have an application app and it depends on zlib and bytestring. The zlib package also depends on bytestring. Initially the versions I am using for my app-1.0 are zlib-0.5 and bytestring-0.9 (versions simplified for presentation). Now I decide in the next version of my app, app-1.1, that I want to use a nice new feature in bytestring version 0.10. So I ask cabal to build my application using a more recent bytestring. The cabal tool will come up with a solution using the new bytestring-0.10 and the same old version of zlib-0.5. Building this solution will involve installing bytestring-0.10 and rebuilding zlib-0.5 against it.

[Diagram: installing bytestring-0.10 and rebuilding zlib-0.5 against it]
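To make that concrete, the dependency change in app.cabal would look roughly like this (a hypothetical sketch, with versions simplified as in the example above):

name:          app
version:       1.1
build-depends: base,
               bytestring >= 0.10,
               zlib       == 0.5.*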

What is the problem here?

The problem is the rebuilding of zlib-0.5.

Why is this a problem?

It is a problem because when we install the instance “zlib-0.5 built against bytestring-0.10” we have to delete the pre-existing instance “zlib-0.5 built against bytestring-0.9”. Anything that depended on that previous instance now has a dangling reference and so is effectively broken.

Why do we have to delete the previous instance?

The way installed packages are managed is such that each GHC package database can only have one instance of each package version. Note that having two different versions would be allowed, e.g. zlib-0.4 and zlib-0.5, but having two instances of zlib-0.5 is not allowed. Not deleting the previous instance would involve having two instances of zlib-0.5 (each instance built against different versions of bytestring).

So the root of the problem is having to delete – or if you like, mutate – package instances when installing new packages. Mutable state strikes again! And this is due to the limitation of only being able to have one instance of a package version installed at once.

Type errors when using packages together

The second, orthogonal, problem is that it is possible to install two packages and then load them both in GHCi and find that you get type errors when composing things defined in the two different packages. Effectively you cannot use these two particular installed packages together.

The reason is that the two packages have been built using different versions of some common dependency. For example, I might have zlib built against bytestring-0.9 and binary built against bytestring-0.10. Now if I try to use them together in GHCi (e.g. compressing the result of binary serialisation) then I will get a type error from GHC saying that bytestring-0.9:Data.ByteString.ByteString is not the same type as bytestring-0.10:Data.ByteString.ByteString. And GHC is not wrong here: we really cannot pretend that these are the same types (at least not in general).

[Diagram: zlib and binary built against different versions of bytestring]
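A minimal sketch of the kind of code that triggers this, assuming a zlib built against bytestring-0.9 and a binary built against bytestring-0.10 (the error text below is paraphrased, not verbatim GHC output):

import Codec.Compression.Zlib (compress)  -- from zlib, built against bytestring-0.9
import Data.Binary (encode)               -- from binary, built against bytestring-0.10

broken = compress (encode "hello")
-- Couldn't match expected type
--   `bytestring-0.9:Data.ByteString.Lazy.Internal.ByteString'
-- with actual type
--   `bytestring-0.10:Data.ByteString.Lazy.Internal.ByteString'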

This is a variant on the classic “diamond dependency problem” that cabal solved years ago. So why do we still get it? In fact we never hit this problem when building a package with cabal, because cabal’s solver looks for “consistent” solutions. (Recall from part 1 that when we say a “consistent” solution we mean one that uses only one version of any package.)

We can still hit this problem when we install things separately and then use the packages directly with ghc or ghci. This is because cabal does not enforce consistency in the developer’s environment. It only enforces consistency within any set of packages it installs simultaneously.

The fundamental problem is that developers expect to be able to use combinations of their installed packages together, but the package tools do not enforce consistency of the developer’s environment.

In practice this class of problem is currently relatively rare. In part that is because one would often hit the problem above involving re-installing and breaking packages. If however we lifted the limitation on installing multiple instances of the same version of a package, then we would always be able to install new versions and we would instead hit this problem much more frequently.

Nix-style package management

The ideas behind Nix are now over 10 years old. When first reading the published papers on Nix some years ago I was struck by how simple and elegant they are. They also appear to work well in practice, with a full Linux distribution based on them, plus a number of other well-regarded tools.

The key ideas are:

  • A persistent package store. That is, persistent in the sense of an immutable functional data type. Packages are added but never mutated. Old packages can be garbage collected. Just like immutable data structures in Haskell, immutable package stores have several advantages.

  • Working environment(s) that are “views” into the package store. A working environment has a number of packages available, and this is just a subset of all the packages that are installed in the package store. There can be many of these that co-exist and switching the view is simple and cheap.

Contrast this to a traditional approach, e.g. what we have now with Cabal, or Linux distros based on .rpm or .deb. In the traditional approach there is a single environment which is exactly the same as the full set of installed packages. So compared to a traditional approach, we have an extra level of indirection. Having views lets us have a much larger collection of packages installed than we have available in the working environment, and allows multiple environments.

A good illustration of what these ideas give us is to see what happens when we want to add a new package:

  1. We start with our initial environment which points to a bunch of packages from the store.

  2. We compile and install a new package into the store. So far this changes very little, which is a big feature! In particular it cannot break anything. The new installed package can co-exist with all the existing ones. There can be no conflicts (the install paths are arranged to guarantee this). No existing packages have been modified or broken. No environments have yet been changed.

  3. Now we have choices about what to do with environments. Our existing environment is unchanged and does not contain the new package. We can create a new environment that consists of the old environment with the extra package added. In principle both the old and the new environments exist. This is very much like a persistent functional data structure:

    let env' = Map.insert pkgname pkg env

    Both exist, the old one is not changed, and we can decide if we are only interested in the new one or if we want to keep both.

  4. Finally we can, if we want, switch our “current” environment to be the new environment with the new package. So while multiple environments can exist, only one of them is active in our current shell session.

So what have we gained from the added indirection of a persistent store + views?

  • We can install new things without breaking anything, guaranteed.

  • We get atomic rollback if we don’t like the changes we made.

  • Multiple independent environments.

  • Environments that are very cheap to create.

The multiple environments effectively give us sandboxes for free. In fact it’s better, because we can easily share artefacts between sandboxes when we want to. That means far fewer rebuilds, and easy global installation of things built in an isolated environment.
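To see why environments are so cheap, here is a toy model of the store/view split as persistent maps (the names and types here are mine, purely for illustration — not the real Nix or Cabal API):

import qualified Data.Map as Map
import           Data.Map (Map)

type PkgId = String            -- an opaque instance identifier (e.g. a hash)
data Pkg   = Pkg { pkgName :: String, pkgDeps :: [PkgId] }

type Store = Map PkgId Pkg     -- persistent: only ever grows
type Env   = Map String PkgId  -- a view: package name -> chosen instance

-- Installing only adds to the store; nothing can break (step 2 above).
install :: PkgId -> Pkg -> Store -> Store
install = Map.insert

-- A new environment is a cheap persistent update; the old one is
-- untouched, which is what gives atomic rollback (steps 3 and 4).
extendEnv :: String -> PkgId -> Env -> Env
extendEnv = Map.insert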

Nix has a few other good ideas, like its functional package description language and some clever tricks for dealing with the messy details of system packages. The key ideas however are the ones I’ve just outlined, and they are the ones that we should steal for GHC/Cabal.

Mechanisms, policies and workflows

The package store and the multiple environments are just a mechanism, not a user interface. The mechanisms are also mostly policy free.

Generally we should start from user requirements, use cases, workflows and the like, and work out a user interface and then decide on mechanisms that will support that user interface. That said, I think it is clear that the Nix mechanisms are sound and sufficiently general and flexible that they will cover pretty much any user interface we decide we want.

So I think our design process should be:

  1. Look at the packaging tool requirements, use cases, workflows etc, and work out a user interface design.

  2. Then figure out how the actions in that user interface translate into operations on the Nix package store and environments.

Addressing the Cabal Hell problems

The Nix approach deals nicely with the problem of breaking re-installations. The Nix mechanisms guarantee that installed packages are never mutated, so no existing installed packages ever break.

The Nix mechanisms do not deal directly with the issue of type errors when using packages together. As we noted before, that requires enforcing the consistency of the developer’s environment. In Nix terms this is a policy about how we manage the Nix environment(s). The policy would be that each environment contains only one version of each package and it would guarantee that all packages in an environment can be used together.

Without wishing to prejudge the future user interface for cabal, I think this is a policy that we should adopt.
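The invariant itself is simple to state. As a sketch (the helper name is mine), over the (name, version) pairs of an environment:

import Data.List (nub)

-- Consistent: no package name appears with two different versions.
consistent :: [(String, String)] -> Bool  -- (package name, version)
consistent env = length (nub (map fst env)) == length (nub env)

For example, [("zlib","0.5"), ("bytestring","0.9")] is consistent, while adding ("bytestring","0.10") would make it inconsistent.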

Enforcing consistency does have implications for the user interface. There will be situations where one wants to install a new package, but it is impossible to add it to the current environment while keeping all of the existing packages. For example, suppose we have two different web stacks that have many packages in common but that require different versions of some common package. In that case we could not have a consistent environment that contains both. Thus the user interface will have to do something when the user asks to add the second web stack to an environment that already contains the first. The user interface could minimise the problem by encouraging a style of use where most environments are quite small, but it cannot be avoided in general.

While what I am suggesting for consistency is relatively strong, we cannot get away without enforcing some restrictions on the environment. For example, if our environment did contain two instances of the same version of a package, then which one would we get when we launch GHCi? So my view is that, given that we cannot avoid the user interface issues with environment consistency, it is better to go for the stronger and more useful form.

In fact we’ve already been experimenting in this direction. The current cabal sandbox feature does enforce consistency within each sandbox. This seems to work ok in practice because each sandbox environment is relatively small and focused on one project. Interestingly, we have had some pressure to relax this constraint because of the compilation cost of creating new sandboxes. (Allowing some inconsistencies in the environment allows the common packages to be shared and thus only compiled once.) Fortunately this issue is one that is nicely solved by Nix environments, which are extremely cheap to create because they allow sharing of installed packages.

Implementation progress

We’ve been making small steps towards the Nix design for many years now. Several years ago we changed GHC’s package database to use the long opaque package identifiers that are necessary to distinguish package instances.

More recently Philipp Schuster did a GSoC project looking into the details of what we need to do to incorporate the Nix ideas within GHC and Cabal. You can see the slides and video of his HiW presentation. We learned a lot, including that there’s quite a lot of work left to do.

Last year Edward Yang and Simon PJ (with advice from Simon Marlow and myself) started working on implementing the “Backpack” package system idea within GHC and Cabal. Backpack also needs to be able to manage multiple instances of packages with the same version (but different deps) and so it overlaps quite a lot with what we need for Nix-style package management in GHC. So Edward’s work has dealt with many of the issues in GHC that will be needed for Nix-style package management.

Another small step is that in GHC 7.10 we finally have the ability to register multiple instances of the same version of a package, and we have the basic mechanism in GHC to support multiple cheap environments (using environment files). Both of these new GHC features are opt-in so that they do not break existing tools.
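For illustration, an environment file is just a plain-text list of directives selecting particular package instances, along these lines (directive names as in later GHC releases; the <hash> suffixes stand in for the real opaque identifiers):

clear-package-db
global-package-db
package-id bytestring-0.10.6.0-<hash>
package-id zlib-0.5.4.2-<hash>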

The remaining work is primarily in the cabal tool. In particular we have to think carefully about the new user interface and how it maps into the Nix mechanisms.

So there has been a lot of progress and if we can keep making small useful steps then I think we can get there. Of course it would help to focus development efforts on it, perhaps with a hackathon or two.

Conclusion

Implementing the Nix approach in GHC/Cabal would cover a major part of the Cabal Hell problems.

In the next post we’ll look at curated package collections, which solve a different (but slightly overlapping) set of Cabal Hell problems. Nix-style package management and curated package collections are mostly complementary and we want both.

Categories: Offsite Blogs

Dominic Orchard: Constraint kinds in Haskell, finally bringing us constraint families

Planet Haskell - Wed, 01/28/2015 - 6:16am

Back in 2009 Tom Schrijvers and I wrote a paper entitled Haskell Type Constraints Unleashed [1], which appeared at FLOPS 2010 in April. In the paper we fleshed out the idea of adding constraint synonyms and constraint families to GHC/Haskell, building upon various existing proposals for class families/indexed constraints. The general idea in our paper, and in the earlier proposals, is to add a mechanism to GHC/Haskell to allow constraints, such as type class or equality constraints, to be indexed by a type in the same way that type families and data families allow types to be indexed by types.

As an example of why constraint families are useful, consider the following type class which describes a simple, polymorphic, embedded language in Haskell, in the “finally tagless” style [2] (this example appears in [1]):

class Expr sem where
  constant :: a -> sem a
  add      :: sem a -> sem a -> sem a

Instances of Expr provide different evaluation semantics for the language. For example, we might like to evaluate the language for numerical values, so we might try and write the following instance:

data E a = E { eval :: a }

instance Expr E where
  constant c = E c
  add e1 e2  = E $ (eval e1) + (eval e2)

However, this instance does not type check. GHC returns the type error:

No instance for (Num a) arising from a use of `+'
In the second argument of `($)', namely `(eval e1) + (eval e2)'
...

The + operation requires the Num a constraint, yet the signature for add states no constraints on type variable a, thus the constraint is never satisfied in this instance. We could add the Num a constraint to the signature of add, but this would restrict the polymorphism of the language: further instances would have this constraint forced upon them. Other useful semantics for the language may require other constraints e.g. Show a for pretty printing. Should we just add more and more constraints to the class? By no means!

Constraint families, as we describe in [1], provide a solution: by associating a constraint family to the class we can vary, or index, the constraints in the types of add and constant by the type of an instance of Expr. The solution we suggest looks something like this:

class Expr sem where
  constraint Pre sem a
  constant :: Pre sem a => a -> sem a
  add      :: Pre sem a => sem a -> sem a -> sem a

instance Expr E where
  constraint Pre E a = Num a
  ... -- methods as before

Pre is the name of a constraint family that takes two type parameters and returns a constraint, where the first type parameter is the type parameter of the Expr class.

We could add some further instances:

data P a = P { prettyP :: String }

instance Expr P where
  constraint Pre P a = Show a
  constant c = P (show c)
  add e1 e2  = P $ prettyP e1 ++ " + " ++ prettyP e2

e.g.

*Main> prettyP (add (constant 1) (constant 2))
"1 + 2"

At the time of writing the paper I had only a prototype implementation in the form of a preprocessor that desugared constraint families into some equivalent constructions (which were significantly more awkward and ugly of course). There has not been a proper implementation in GHC, or of anything similar. Until now.

At CamHac, the Cambridge Haskell Hackathon, last month, Max Bolingbroke started work on an extension for GHC called “constraint kinds”. The new extension unifies types and constraints such that the only distinction is that constraints have a special kind, denoted Constraint. Thus, for example, the Show class constructor is actually a type constructor, of kind:

Show :: * -> Constraint

For type signatures of the form C => t, the left-hand side is now a type term of kind Constraint. As another example, the equality constraint constructor ~ now has kind:

~ :: * -> * -> Constraint

i.e. it takes two types and returns a constraint.

Since constraints are now just types, existing type system features on type terms, such as synonyms and families, can be reused on constraints. Thus we can now define constraint synonyms via standard type synonyms, e.g.

type ShowNum a = (Num a, Show a)
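Such a synonym can then be used anywhere a class constraint could appear, e.g. (a function of my own invention):

showDouble :: ShowNum a => a -> String
showDouble x = show (x + x)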

And most excitingly, constraint families can be defined via type families of return kind Constraint. Our previous example can be written:

class Expr sem where
  type Pre sem a :: Constraint
  constant :: Pre sem a => a -> sem a
  add      :: Pre sem a => sem a -> sem a -> sem a

instance Expr E where
  type Pre E a = Num a
  ...

Thus, Pre is a type family of kind (* -> *) -> * -> Constraint. And it all works!

The constraint kinds extension can be turned on via the pragma:

{-# LANGUAGE ConstraintKinds #-}
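Putting the pieces together, the running example becomes a self-contained module along the following lines (the module header, the TypeFamilies pragma and the GHC.Exts import are my additions, assuming the Constraint kind is importable from GHC.Exts as it was around GHC 7.4):

{-# LANGUAGE ConstraintKinds, TypeFamilies #-}
module Tagless where

import GHC.Exts (Constraint)

class Expr sem where
  type Pre sem a :: Constraint
  constant :: Pre sem a => a -> sem a
  add      :: Pre sem a => sem a -> sem a -> sem a

data E a = E { eval :: a }

instance Expr E where
  type Pre E a = Num a
  constant c = E c
  add e1 e2  = E $ (eval e1) + (eval e2)

data P a = P { prettyP :: String }

instance Expr P where
  type Pre P a = Show a
  constant c = P (show c)
  add e1 e2  = P $ prettyP e1 ++ " + " ++ prettyP e2

-- *Main> prettyP (add (constant (1 :: Int)) (constant 2))
-- "1 + 2"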

Max has written about the extension over at his blog, which has some more examples, so do go check that out. As far as I am aware the extension should be hitting the streets with version 7.4 of GHC. But if you can’t wait it is already in the GHC HEAD revision so you can checkout a development snapshot and give it a whirl.

In my next post I will be showing how we can use the constraint kinds extension to describe abstract structures from category theory in Haskell that are defined upon subcategories. I already have a draft note about this (edit July 2012: the draft note has been subsumed by my TFP 2012 submission) if you can’t wait!

 

References


[1] Orchard, D., Schrijvers, T.: Haskell Type Constraints Unleashed, FLOPS 2010 [author’s copy with corrections] [On SpringerLink]

[2] Carette, J., Kiselyov, O., Shan, C.-c.: Finally Tagless, Partially Evaluated, APLAS 2007


Categories: Offsite Blogs

Don Stewart (dons): Contracting Haskell development role at Standard Chartered

Planet Haskell - Wed, 01/28/2015 - 5:44am

The Strats team at Standard Chartered is hiring a developer for a 1 year contracting role in London.

The role is to develop and extend our parsing and validation library for FpML, using the FpML Haskell library to parse and build financial product data into our internal Haskell data types. You will extend our library to support more products, as well as building testing and validation tools (e.g. in QuickCheck) to confirm the soundness and completeness of the implementation.

Ideally you have at least a year’s experience with Haskell. A recent computer science graduate with a few Haskell (or similar typed FP language) courses and maybe some open source contributions would be a good fit. It is an advantage if you already have experience with Parsec-like combinator libraries and/or QuickCheck.

You will join an expert team in London, and this is a great opportunity to gain experience using Haskell in production as part of a very experienced team. (We have more than 2 million lines of Haskell, and our own Haskell compiler).

The role requires physical presence on the trading floor in London. Remote work isn’t an option, though some work from home may be possible. No financial background is required.

More info about our development process is in the 2012 PADL keynote, and a 2013 HaskellCast interview.

If this sounds exciting to you, please send CVs to me – donald.stewart @ sc.com


Tagged: jobs, london
Categories: Offsite Blogs

Arrays - HaskellWiki

del.icio.us/haskell - Wed, 01/28/2015 - 2:50am
Categories: Offsite Blogs

Upcoming GHC version (was: Re: Drastic Prelude changes imminent)

libraries list - Wed, 01/28/2015 - 1:16am
If BBP is going to be released as-is, can we please call the next release GHC 8.0? Good stuff: http://semver.org/ Thanks, Greg
Categories: Offsite Discussion

Using GHC API to compile Haskell sources to CORE andCORE to binary

haskell-cafe - Wed, 01/28/2015 - 1:10am
Hello All! :) Recently I’ve come across a problem and hopefully you could help me with it. Basically I and some people here are running a startup and we have created a language for processing images. This language has 2 interchangeable representations: a textual one and a visual, dataflow one. Right now we are compiling it to Haskell sources, but we want to move over to GHC CORE. Getting into GHC has a high learning curve, as far as we see, and we would be very thankful for any help with a minimal working example of compiling Haskell sources to CORE and then CORE to binaries. A couple of days ago I posted a detailed question on StackOverflow, here: http://stackoverflow.com/questions/28059669/using-ghc-api-to-compile-haskell-sources-to-core-and-core-to-binary I would be very thankful for any help or hints! All the best, Wojciech
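For what it’s worth, the usual source-to-Core recipe with the GHC API looks roughly like the sketch below (GHC-7.x-era API via the ghc and ghc-paths packages; exact module and function names move around between GHC versions, and the Core-to-binary half is not shown):

import GHC
import GHC.Paths (libdir)               -- from the ghc-paths package
import HscTypes (mg_binds)
import Control.Monad.IO.Class (liftIO)

main :: IO ()
main = runGhc (Just libdir) $ do
  dflags <- getSessionDynFlags
  _ <- setSessionDynFlags dflags
  target <- guessTarget "Example.hs" Nothing
  setTargets [target]
  _ <- load LoadAllTargets
  modSum    <- getModSummary (mkModuleName "Example")
  parsed    <- parseModule modSum
  typed     <- typecheckModule parsed
  desugared <- desugarModule typed
  let core = mg_binds (coreModule desugared)  -- the [CoreBind] of the module
  liftIO (putStrLn ("Number of Core bindings: " ++ show (length core)))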
Categories: Offsite Discussion

runSubStateT

haskell-cafe - Wed, 01/28/2015 - 12:14am
runSubStateT :: Monad m => (s -> s') -> (s' -> s) -> StateT s' m a -> StateT s m a
runSubStateT to from m = StateT (\s -> liftM (\(a, s') -> (a, from s')) (runStateT m (to s)))

Anyone ever needed something like this? Does it already exist in some form in the standard libraries? Ciao
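One place this is safe is when to and from form an isomorphism, e.g. unwrapping a newtype (a hedged sketch; the names are mine, and lens’s zoom generalises the same idea):

import Control.Monad.State

newtype Celsius = Celsius Double

-- Run a Double-state computation against Celsius-valued state:
warmUp :: Monad m => StateT Celsius m ()
warmUp = runSubStateT (\(Celsius d) -> d) Celsius (modify (+ 1.0))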
Categories: Offsite Discussion

CFP: WPTE 2015 Second International Workshop on Rewriting Techniques for Program Transformations and Evaluation

General haskell list - Tue, 01/27/2015 - 8:06pm
FIRST CALL FOR PAPERS

Second International Workshop on Rewriting Techniques for Program Transformations and Evaluation (WPTE 2015), affiliated with RDP 2015
2 July, 2015, Warsaw, Poland
http://www.trs.cm.is.nagoya-u.ac.jp/event/wpte2015/

Aims and Scope
==============

The aim of WPTE is to bring together the researchers working on program transformations, evaluation, and operationally-based programming language semantics, using rewriting methods, in order to share the techniques and recent developments and to exchange ideas to encourage further activation of research in this area. The previous WPTE was held in Vienna 2014.

Topics of interest in the scope of this workshop include:
* Correctness of program transformations, optimizations and translations.
* Program transformations for proving termination, confluence and other properties.
* Correctness of ev
Categories: Incoming News

Error trying to re-upload to the Haskell wiki

haskell-cafe - Tue, 01/27/2015 - 7:30pm
L.S.,

I am trying to upload a file to the Haskell wiki again, after I deleted it (the description was wrong and there is no option to edit the description). Every time I try this, I get the message:

  Remote server or file not found
  You tried to access the address https://wiki.haskell.org/Special:Upload, which is currently unavailable.

The file is not visible in the upload log. How can I upload this file again?

Regards, Henk-Jan van Tuyl
Categories: Offsite Discussion

Representation of lenses

haskell-cafe - Tue, 01/27/2015 - 7:18pm
Hi, I'm planning on discussing lenses with some colleagues shortly so I'm consolidating some of my notes into a coherent story. At one point I found myself describing lenses as a way of packaging up a getter and a setter into a single thing. However, this raises the question: why not just use a pair (getter, setter)? More precisely, I believe that the types (a -> s, s -> a -> a) and (forall f. Functor f => (s -> f s) -> a -> f a) are isomorphic. Is that right? I see that one advantage of the lens type is that you can use (.) to compose them since they're just functions, but that doesn't bother me much and it seems I could define another operator to compose (getter, setter) lenses and the rest of the machinery would work either way. It's also possible that you can't get the full generality of (forall f. (s -> f t) -> a -> f b) lenses with getter/setter pairs, although I haven't worked through the details yet so don't know either way. So, my question is: what is the advantage of representing lenses in the
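For reference, the two directions of that isomorphism can be written down directly (standard van Laarhoven constructions; the names below are mine):

{-# LANGUAGE RankNTypes #-}
import Control.Applicative (Const (..))
import Data.Functor.Identity (Identity (..))

type VLLens a s = forall f. Functor f => (s -> f s) -> a -> f a

toLens :: (a -> s) -> (s -> a -> a) -> VLLens a s
toLens get set f x = fmap (\s -> set s x) (f (get x))

fromLens :: VLLens a s -> (a -> s, s -> a -> a)
fromLens l = ( getConst . l Const
             , \s x -> runIdentity (l (\_ -> Identity s) x) )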
Categories: Offsite Discussion

Help finding one-sided ordered list implementation

haskell-cafe - Tue, 01/27/2015 - 7:04pm
In "Numerical Representations as Higher-Order Nested Datatypes", Ralf Hinze describes an implementation of one-sided ordered lists as 2-3 search trees under the left spine view. I'm wondering if anyone's actually coded those up somewhere I could download them, or if I'll have to dig into all the details in the paper and write them myself. Thanks, David
Categories: Offsite Discussion

Looking for feedback - emulating a 1-bit register with a Neural Network

Haskell on Reddit - Tue, 01/27/2015 - 6:39pm

I wrote a bit of background about the project over at /r/compsci: http://www.reddit.com/r/compsci/comments/2tw5vc/emulating_a_1bit_register_with_a_neural_network/

I'm pretty new to Haskell so I'm sure there are lots of problems. The repo can be found here: https://bitbucket.org/rickdzekman/memnet

I was hoping to get some feedback on my Haskell implementation. Though if you have any general comments about the concept I'm keen to hear them.

Some things that I think are weaknesses:

  • I don't encapsulate the non-termination aspect at all. I've looked at IterT as well as Pipes, but I settled on writing a simple loop.
  • I also wrote my own Semaphores using MVars instead of using something like Pipes.
  • The way inputs/outputs are passed to/from the network is a bit haphazard. I basically take a number, convert it to bits, map those bits to node IDs in the network, and then do the inverse for outputs (see the sketch after this list). It might have been easier to actually use bits, whereas I used Bools.
  • I've been told I should look into Vectors rather than Arrays for my mutable state. This proof of concept is so basic that I didn't look at performance at all. But my Node type is so simple that apparently I should be able to use Unboxed Vectors.
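A sketch of the bit plumbing mentioned above (width fixed at 8; the function names are mine, not the repo's):

import Data.Bits (testBit)

toBits :: Int -> [Bool]   -- least significant bit first
toBits n = map (testBit n) [0 .. 7]

fromBits :: [Bool] -> Int
fromBits bs = sum [ 2 ^ i | (i, b) <- zip [0 :: Int ..] bs, b ]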

I'm aware the GUI is terrible! I'm on Windows and had heaps of trouble with GTK. When I finally did get it up and running the non-termination of my main program caused GTK to crash. Gloss on the other hand was really simple to setup.

I've had a few raised eyebrows about the way I mix IO and ST. Perhaps it's my imperative background coming through here. But I use an STArray to avoid copying the entire network of Nodes each time they pass a message between each other (considering the network is cyclic, it also avoids tying the knot). I then thread the state within an IO loop, extracting output information via runSTArray. That particular bit is here: https://bitbucket.org/rickdzekman/memnet/src/6f3006e39405f91d53dc278dd4d2bc4e2c4a5756/src/AI/MemNet/RunMemNet.hs?at=master
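In miniature, the pattern described is something like this (the names and the Bool-array representation are mine; the real code is in the linked module):

import Data.Array (Array, elems)
import Data.Array.ST (runSTArray, thaw)

-- One pure step over the network: mutate a thawed copy inside ST.
step :: Array Int Bool -> Array Int Bool
step net = runSTArray $ do
  arr <- thaw net
  -- ... message passing between nodes would mutate arr here ...
  return arr

-- Thread the state through an IO loop, reading outputs each round.
loop :: Array Int Bool -> IO ()
loop net = do
  let net' = step net
  print (elems net')
  loop net'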

As a Haskell newbie (who also isn't a professional programmer) I would really appreciate some feedback from the more knowledgeable people out there.

submitted by rick_dz
Categories: Incoming News