Here are two simple implementations of Damas-Hindley-Milner type inference.
First, is my Haskell version of a region-based optimized type checker as explained by Oleg Kiselyov, in his excellent review of the optimizations to generalization used in OCaml. Oleg gives an SML implementation, which I’ve Haskellized rather mechanically (using ST instead of mutable references, etc.) The result is a bit ugly, but it does include all the optimizations explained by Oleg above (both lambda-depth / region / level fast generalization and instantiation, plus path compression on linked variables, and not doing expensive occurs checks by delaying them to whenever we traverse the types anyway).
Second, here’s my much shorter and more elegant implementation using the neat unification-fd package by Wren Romano. It’s less optimized though – currently I’m not doing regions or other optimizations. I’m not entirely satisfied with how it looks: I’m guessing this isn’t how the author of unification-fd intended generalization to be implemented, but it works. Generalization does the expensive lookup of free metavariables in the type environment. Also the instantiate function is a bit clunky. The unification-fd package itself is doing inexpensive occurs checks as above, and path compression, but doesn’t provide an easy way to keep track of lambda-depth so I skipped that part. Perhaps the Variable class should include a parameterized payload per variable, which could be used for (among other things) keeping track of lambda-depth.
Once again, the event occurred at the IRILL (Innovation and Research Initiative for Free Software) in Paris, on September 18th to 20th.
The sprint had 7 participants: Danill Frumin, Eric Kow, Florent Becker, Ganesh Sittampalam, Guillaume Hoffmann, Thomas Miedema and Vinh Dang.
Darcs and GHC 8Thomas Miedema is a Haskell and GHC hacker, and came on the first day of the sprint. Since Darcs is a system that aims at supporting the various GHC versions out there, Thomas helped us preparing for GHC 8, the next major version. He explained us one issue of GHC 8 that got triggered by Darcs: a bug with the PatternSynonyms extension. Fortunately it seems that the bug will be fixed in GHC HEAD. (First release candidate is planned for December).
Thomas explaining PatternSynonyms to Eric and GaneshDiving into SelectChanges and PatchChoices codeOn the first day I (Guillaume) claimed the "rollback takes ages" bug, which made me look into SelectChanges and PatchChoices code. The result is that I still haven't yet fixed the bug, but I discovered that patch matching was unnecessarily strict, which I could fix easily. Internally, there are two interesting patch types when it comes to matching:
- NamedPatch: represent the contents of a patch file in _darcs/patches, that is, its info and its contents
- PatchInfoAnd: represents the info of a patch as read from an inventory file (from _darcs/inventories or _darcs/hashed_inventory) and a lazy field to its corresponding NamedPatch.
So, before the sprint, as soon as you wanted to match on a patch file, you had to open (and maybe download) its file, even if this was useless. With my change (mostly in Darcs.Patch.Match) we gained a little more laziness; and the unreasonably slow command "rollback -p ." passes from 2 minutes to ~15 seconds on my laptop. I hope to push this change into Darcs 2.10.2.
Eric, Guillaume and Vinh
Now, the real source of the "rollback -p ." slowness is that patch selection is done on FL's (Forward List), while commands like rollback and obliterate naturally work backwards in time on RL. Currently, an RL is inverted and then given to the patch selection code, which is not convenient at all! Moreover, the actual representation of history of a Darcs repository is (close to being) an RL. So it seems like a proper fix for the bug is to generalize the patch selection code to also work on RL's; which may involve a good amount of typeclass'ing in the relevant modules. I think this will be too big/risky to port to the 2.10 branch, so it will wait for Darcs 2.12.
Ganesh's new not-yet-officially-named stash command
A few days before the sprint, Ganesh unveiled his "stash" branch. It feature a refactoring that enables to suspend patches (ie, put them into a state such that they have no effect in the working copy) but without changing their identity (which is currently what occurs with the darcs rebase command). This enables to implement a git-stash-like feature.
The sprinters (IRL and on IRC) discussed the possible name of the command that should encapsulate this stash feature. More importantly, on the last day we discussed what would be the actual UI of such a feature. As always when a new feature is coming to darcs, we want to make the UI as darcsy as possible :-)
Coming back to the code, Ganesh's refactoring, if extensive, will also simplify the existing types for suspended patches. We decided to go with it.Dan's den
Dan demonstrating den (on the left: Florent)Daniil Frumin was this years Google Summer of Code student for Darcs. Mentored by Ganesh, he brought improvements to Darcsden, many of them being already deployed. Among them, it is possible to launch a local instance of Darcsden (using an executable called den), not unlike Mercurial's "serve" command.
Dan tells more about his work and this sprint in his latest blog post.
A better website and documentationAs a newcomer to the project, Vinh took a look at the documentation, especially the website of the project. He implemented changes to make the front page less intimidating and more organized. He also had a fresh look at our "quickstart" and proposed improvements which we felt were much needed!
Florent's projectsFor this sprint, Florent was more an external visitor than a Darcs hacker. He talked about one of his current projects: Pijul, a version control system with another approach. Check out their website!
Conclusion and the next sprintIn the end this sprint turned out to be more productive and crowded than we initially thought! It has been a lot of time since the previous one, so we had a lot of things to share at first. Sprints do make synchronization between contributors more effective. They are also a moment when we can get more concentrated on the Darcs codebase, and spend more time tacking some issue.
Avenue d'Italie, ParisWe would like to thank the IRILL people for hosting the sprint for the third time and our generous donators to make travelling to sprints easier.
We already have a time and a place for the next sprint: Sevilla, Spain in January 2016! The exact moment will be announced later, but you can already start organizing yourself and tell us if you're going.
Thomas, Eric and GaneshFrom left to right: Vinh, Florent, Dan, Ganesh and Eric
This post is intended to be a short summary of my SoC project, as well as my recent trip to Darcs sprint.Introduction
I am finishing up this post on the train back from the Autumn 2015 Darcs sprint. Today (Sept 20, Sun) was a very fun day full of darcs chatting and coding. By the end of the day we’ve heard a number of presentations
- Ganesh described his work on "stash" command for darcs (naming subject to change!). It involves some refactoring of the rebase code. I hope we would hear more from him on that, because the internal workings are actually quite interesting — I believe it’s the first time singleton types and DataKinds are used in the darcs codebase;
- Florent Becker gave a presentation about Pijul and the theory behind it — A Categorical Theory of Patches by Samuel Mimram and Cinzia Di Giusto, see arXiv:1311.3903;
- Vinh Dang talked about his improvements on the darcs wiki (it’s about time to organize the website), his goal was to make it more accessible to the newcomers;
- Yours truly gave a small presentation, outline of which you will find below:
I have spent this summer hacking on DarcsDen as part of the Google Summer of Code program.
My basic goal was to create a "local" version of darcsden. It was not a trivial task to install darcsden (and probably installation is still not very easy!). It uses a third-party software like Redis and CouchDB. During my coding process I modifed darcsden such that it now can be a good choice for local (or lightweight single user) darcs UI. The local darcsden version can be used without any databases, tracking the repositories in the local file system. This way darcsden can be used by a developer on her local computer, like darcsum, (for working with/comparing repositories) as well as a replacement for darcsweb/cgit — a single user web front for darcs repositories.
Besides that a user of a local version can use darcsden’s interactive UI for recording new patches, as well as a command-line tool den for a quick way of browsing the repositories.
Installing darcsden-local is currently not as easy as I want to it be, but I hope that soon you will be able to install it just by running cabal install darcsden or brew install darcsden. As for now, one could do the following:
- darcs get --lazy http://hub.darcs.net/co-dan/darcsden-local
- cabal install . or stack install
This should install the darcsden binary and all the related css/js files. You can start darcsden by running darcsden --local. If you open your web browser you should see a list of repositories in the current directory.
However, you might have repositories scattered all over the place, and scanning your whole system for darcs repositories is just inefficient. For this purposes darcs keeps a list of repositories in a file inside your ~/.darcs directory. You can manage that list either by hand, or using the command-line den tool:
- den $PATH — add $PATH to the list of repositories in ~/.darcs/darcsden_repos (if it’s not already present there), start darcsden server in the background and launch the browser pointing to $PATH;
- den — the same as den .;
- den --add $PATH — add $PATH to the list of repositories in ~/.darcs/darcsden_repos;
- den --remove $PATH — remove $PATH from the list of repositories in ~/.darcs/darcsden_repos.
In order to further customize darcsden, one can tweak the configuration file located at ~/.darcs/darcsden.conf. Apart from the usual darcsden settings one may pay attention to the following variables:
- homeDir (default .), points to the "root" directory with repositories. If the list file ~/.darcs/darcsden_repos is not present darcsden will recursively search repositories in that directory
- unLocal, pwLocal: the username and the password of the "local" user
The user/password credentials are required for editing the repositories and recording new patches. However, the den binary should automatically pick them up from the config file and log you in.
Once you are logged in, and you have unrecorded changes in the repository, you can use darcsden UI to record a new patch.
Below you can see an example of recording and merging patches from a branch.
Darsden allows you to create forks/branches of your repositories, and it keeps track of the patch dependencies in your branches.
More "internal" changes:
- Instead of having to specify some parts of the configuration in DarcsDen.Settings, darcsden now uses runtime flags: –hub for using hub-specific modifications, –local for using the local backend and no flag for default behaviour
- The flag actually choose what are called instances — something that a bit less fine-grained than settings. Instances allow you to pick backend, overwrite settings, modify the looks of the front page.
- HTTP-testing using wreq. The previous test suite used selenium and it got bit-rotten. The wreq-based is easier to run and perhaps slightly easier to maintain.
- HTTP auth, which is used as part of the local instance; the den tool utilizes it to log the user in automatically.
- Support for repositories inside directories and nested repositories.
- All the backend code that is used for handling repositories and meta-data on the file system.
- Functionality for downloading zipped dist archives of darcs repositories.
- Assorted mini-fixes
We have also discussed the possibility of adding rewrite rules implementing short-cut fusion for the directed types in Darcs. In order to see if it’s really worth it we would have to bring back to life the benchmarking suite (or at least check on it!).
It was a really exciting weekend for me and I was delighted to meet some of my IRC friends. As it turns out, it is a small world and despite being from different parts of it we have a bunch of common IRL friends, professors. As the French would (probably not) say, très bien. The next darcs sprint will probably be in January, and probably in Europe, again.
Tagged: darcs, haskell
I recently wrote a new Stack feature and blogged about it. The feature is about adding support for PVP bounds in cabal files. I did my very best to avoid stirring up anything controversial. But as is usual when the PVP comes up, a disagreement broke out on Reddit about version bounds. This essentially comes down to two different ways to view an upper bound on a package:
- We've tested, and know for certain that the new version of a dependency is incompatible with our package
- I can't guarantee that any new versions of a dependency will be compatible with our package
If you look through the history of PVP debates, you'll see that this argument comes up over and over again. I'm going to make a bold statement: if a core feature to how we do package management is so easily confused, there's a problem with how we're using the feature. I made an offhand comment about this on Twitter:
Instead of cabal file version ranges, we should have a set of "built with" versions per package. Let tooling generate and interpret the data— Michael Snoyman (@snoyberg) September 24, 2015 <script async="async" charset="utf-8" src="http://platform.twitter.com/widgets.js"></script>
Based on the positive feedback to that tweet, I'm moving ahead with making a more official proposal.How PVP bounds work today
Here's the theory of how you're supposed to write PVP bounds in a file:
- Test your package against a range of dependencies. For example, let's say we tested with text-1.1.2 and text-18.104.22.168
- Modify the build-depends in your .cabal file to say text >= 1.1.2 && < 1.3, based on the fact that it's known to work with at least version 1.1.2, and unknown to work on anything than major version 1.2.
- Next time you make a release, go through this whole process again.
PVP detractors will respond with a few points:
- You don't know that your package won't work with text-1.1.1
- You don't know that your package won't work with text-1.3
- You don't know for certain that your package will work with text-1.1.3, text-22.214.171.124, or text-1.2.1 (yes, it should based on PVP rules, but mistakes can happen.
- Repeating this testing/updating process manually each time you make a code change is tedious and error-prone.
If you notice, what we did in the above was extract the cabal file's metadata (version bounds) from what we actually know (known versions that the package works with). I'm going to propose a change: let's capture that real information, instead of the proxy data. The data could go into the cabal file itself, a separate metadata file, or a completely different database. In fact, the data doesn't even need to be provided by the package author. Stackage Nightly, for instance, would be a wonderful source of this information.
A dependency solver - perhaps even cabal-install's - would then be able to extract exactly the same version bound information we have today from this data. We could consider it an intersection between the .cabal file version bounds and the external data. Or, we could ignore .cabal file bounds and jump straight to the database. Or we could even get more adventurous, e.g. preferring known-good build plans (based on build plan history).
In theory, this functionality - if done as a separate database from the .cabal files themselves - means that on-Hackage revisions would be much less important, possibly even a feature that no one needs in the future.Automation!
And here's the best part: this doesn't require authors to do anything. We can automate the entire process. There could even be build servers sitting and churning constantly trying to find combinations that build together. We've seen already how difficult it is to get authors to adopt a policy. The best policy is one that can be automated and machine run.Downsides
Problems I've thought of so far:
- Some packages (notably Edward Kmett's) have a versioning scheme which expresses more information than the PVP itself, and therefore the generated version bounds from this scheme may be too strict. But that won't necessarily be a problem, since a build server will be able to just test new versions as they come out.
- This very blog post may start a flame war again, which I sincerely hope doesn't happen.
In order for this to really happen, we need:
- Broad support for the idea
- Changes to the cabal-install dependency solver (or an alternate replacement)
- Central infrasturcture for tracking the build successes
- Tooling support for generating the build success information
And to be blunt: this is not a problem that actually affects me right now, or most people I work with (curation is in a sense a simplified version of this). If the response is essentially "don't want it," let's just drop it. But if people relying on version bounds today think that this may be a good path forward, let's pursue it.
We have 'first' in Arrow, baked for tuple. And we have 'first' in Bifunctor, that bakes for function. Is there one / is it practical to have which generalizes on both?submitted by literon
[link] [13 comments]
In a few weeks, I will have the opportunity to offer a weekend workshop to selected and motivated high school students1 to a topic of my choice. My idea is to tell them something about logic, proofs, and the joy of searching and finding proofs, and the gratification of irrevocable truths.
While proving things on paper is already quite nice, it is much more fun to use an interactive theorem prover, such as Isabelle, Coq or Agda: You get immediate feedback, you can experiment and play around if you are stuck, and you get lots of small successes. Someone2 once called interactive theorem proving “the worlds most geekiest videogame”.
Unfortunately, I don’t think one can get high school students without any prior knowledge in logic, or programming, or fancy mathematical symbols, to do something meaningful with a system like Isabelle, so I need something that is (much) easier to use. I always had this idea in the back of my head that proving is not so much about writing text (as in “normally written” proofs) or programs (as in Agda) or labeled statements (as in Hilbert-style proofs), but rather something involving facts that I have proven so far floating around freely, and way to combine these facts to new facts, without the need to name them, or put them in a particular order or sequence. In a way, I’m looking for labVIEW wrestled through the Curry-Horward-isomorphism. Something like this:
This interactive theorem prover allows you to do perform proofs purely by dragging blocks (representing proof steps) onto the paper and connecting them properly. There is no need to learn syntax, and hence no frustration about getting that wrong. Furthermore, it comes with a number of example tasks to experiment with, so you can simply see it as a challenging computer came and work through them one by one, learning something about the logical connectives and how they work as you go.
For the actual workshop, my plan is to let the students first try to solve the tasks of one session on their own, let them draw their own conclusions and come up with an idea of what they just did, and then deliver an explanation of the logical meaning of what they did.
The implementation is heavily influenced by Isabelle: The software does not know anything about, say, conjunction (∧) and implication (→). To the core, everything is but an untyped lambda expression, and when two blocks are connected, it does unification4 of the proposition present on either side. This general framework is then instantiated by specifying the basic rules (or axioms) in a descriptive manner. It is quite feasible to implement other logics or formal systems on top of this as well.
Another influence of Isabelle is the non-linear editing: You neither have to create the proof in a particular order nor have to manually manage a “proof focus”. Instead, you can edit any bit of the proof at any time, and the system checks all of it continuously.
Obviously, there is still plenty that can be done to improve the machine. In particular, the ability to create your own proof blocks, such as proof by contradiction, prove them to be valid and then use them in further proofs, is currently being worked on. And while the page will store your current progress, including all proofs you create, in your browser, it needs better ways to save, load and share tasks, blocks and proofs. Also, we’d like to add some gamification, i.e. achievements (“First proof by contradiction”, “50 theorems proven”), statistics, maybe a “share theorem on twitter” button. As the UI becomes more complicated, I’d like to investigating moving more of it into Haskell world and use Functional Reactive Programming, i.e. Ryan Trickle’s reflex, to stay sane.
Customers who liked The Incredible Proof Machine might also like these artifacts, that I found while looking whether something like this exists:
- Easyprove, an interactive tool to create textual proofs by clicking on rules.
- Domino On Acid represents natural deduction rules in propositional logic with → and ⊥ as a game of dominoes.
- Proofscape visualizes the dependencies between proofs as graphs, i.e. it operates on a higher level than The Incredible Proof Machine.
- Proofmood is a nice interactive interface to conduct proofs in Fitch-style.
- Proof-Game represents proofs trees in a sequent calculus with boxes with different shapes that have to match.
- JAPE is an editor for proofs in a number of traditional proof styles. (Thanks to Alfio Martini for the pointer.)
- Logitext, written by Edward Z. Yang, is an online tool to create proof trees in sequent style, with a slick interface, and is even backed by Coq! (Thanks to Lev Lamberov for the pointer.)
- Carnap is similar in implementation to The Incredible Proof Machine (logical core in Haskell, generic unification-based solver). It currently lets you edit proof trees, but there are plans to create something more visual.
- Clickable Proofs is a (non-free) iOS app that incorporates quite a few of the ideas that are behind The Incredible Proof Machine. It came out of a Bachelor’s thesis of Tim Selier and covers propositional logic.
- Euclid the game by Kasper Peulen is a nice game to play with geometric constructions.
Does anyone know the reference?↩
We almost named it “Proofcraft”, which would be a name our current Minecraft-wild youth would appreciate, but it is alreay taken by Gerwin Kleins blog. Also, the irony of a theorem prover being in-credible is worth something.↩
Luckily, two decades ago, Tobias Nipkow published a nice implementation of higher order pattern unification as ML code, which I transliterated to Haskell for this project.↩
Hi guys! So I'm trying to learn Haskell and I'm having a tough time :( so what I want to do is take an input file and list the words in alphabetical order with the line number attached (Must not include numbers or if a word has a number at the end it should be removed) This is what I have so far: reateIndex:: String -> [(String, Int)]
createIndex str = sortBy compare. concat . zipWith(\n xs -> [(x,n)|x<-xs])[1..] . map words . lines $ str
it gives me a list of the words in ASCII order but includes the numbers :( Any advice? Thanks in advance!!
p.s if I have a file of: I am what I
am, and I (765) do not like Spam. Abc2. Abc3.
I should get:
what 1submitted by THE_PANDA_EXPRESS
[link] [2 comments]