News aggregator

Laws of `some` and `many`

Haskell on Reddit - Tue, 10/14/2014 - 10:58am

The documentation of Alternative says:

If defined, some and many should be the least solutions of the equations:

some v = (:) <$> v <*> many v
many v = some v <|> pure []

Although this is fine in many cases, and I'm ok with those being default definitions, I don't think that these equations should be considered laws.

For example, I would expect that if X is a Kleene algebra, then Const X is an Alternative. But the Kleene star doesn't necessarily satisfy the characterisation above, hence Const X cannot technically be made a legal Alternative, despite satisfying all the other laws, plus many more, including distributivity.
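
As a minimal sketch of that situation, assume a hypothetical regular-expression syntax `Re` standing in for the Kleene algebra X and a Const-like wrapper `K` (both names are illustrative and not taken from any library). Here `many` is the Kleene star itself rather than a least solution of the equations above. Raw syntax trees only satisfy the algebraic laws up to language equivalence, so this shows the shape of such an instance, not a proof that it is lawful.

```haskell
import Control.Applicative (Alternative (..))

-- Hypothetical regular-expression syntax standing in for a Kleene algebra X.
data Re
  = Zero          -- matches nothing
  | One           -- matches the empty string
  | Sym Char      -- matches a single character
  | Re :+: Re     -- choice
  | Re :.: Re     -- sequencing
  | Star Re       -- Kleene star
  deriving (Eq, Show)

-- Const-like wrapper: the type parameter is phantom.
newtype K a = K Re
  deriving (Eq, Show)

instance Functor K where
  fmap _ (K r) = K r

instance Applicative K where
  pure _ = K One
  K r <*> K s = K (r :.: s)

instance Alternative K where
  empty = K Zero
  K r <|> K s = K (r :+: s)
  -- Override the defaults: star and plus are formal, finite operations.
  many (K r) = K (Star r)
  some (K r) = K (r :.: Star r)
```

With these overrides, `many (K (Sym 'a'))` is the finite value `K (Star (Sym 'a'))`, which can be inspected and compared.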

In optparse-applicative I defined a "free alternative functor with star", which is essentially an Alternative where some is a completely formal operation that satisfies no equations whatsoever.

I can't just use the default definition of some, because that creates an "infinite" structure that cannot be statically analysed.

Now, this type, despite satisfying all the Applicative laws and associativity of <|>, is not technically an Alternative because of that silly requirement above, but I would really like to avoid having to use a different name for the Kleene star operator.

What do people think is the right approach here? I personally think that paragraph in the documentation of Alternative should be removed, as it's too simplistic and rules out interesting instances, but maybe there's a better solution.

submitted by pcapriotti
Categories: Incoming News

Status of GHC targeting Android on ARM?

Haskell on Reddit - Tue, 10/14/2014 - 10:52am

There have been several posts here and on mailing lists in the past about GHC targeting Android on ARM. Are there any more recent notes anywhere detailing the build process, any known problems, etc?

submitted by homeopathetic
Categories: Incoming News

Wiki account creation

haskell-cafe - Tue, 10/14/2014 - 4:31am
Preferred Username is: Looms (I hope I have done this correctly)
Categories: Offsite Discussion

Shake's Internal State

Haskell on Reddit - Tue, 10/14/2014 - 1:04am
Categories: Incoming News

Haskell Introduction - YouTube - Mon, 10/13/2014 - 10:08pm
Categories: Offsite Blogs

online tutorial

haskell-cafe - Mon, 10/13/2014 - 6:13pm
Hello, I want to learn haskell by using the fpcomplete site. Now I wonder if this ( is a good tutorial for a beginner. Roelof
Categories: Offsite Discussion

New Functional Programming Job Opportunities

haskell-cafe - Mon, 10/13/2014 - 5:00pm
Here are some functional programming job opportunities that were posted recently:

* Functional Software Engineer at Cake Solutions Ltd
* Software Engineer / Developer at Clutch Analytics/ Windhaven Insurance

Cheers, Sean Murphy
Categories: Offsite Discussion

Proposal: Add isSubsequenceOf to Data.List

libraries list - Mon, 10/13/2014 - 4:34pm
Data.List has `subsequences`, calculating all subsequences of a list, but it doesn't provide a function to check whether a list is a subsequence of another list. `isSubsequenceOf` would go into the "Predicates" section, which already contains:

* isPrefixOf (dual of inits)
* isSuffixOf (dual of tails)
* isInfixOf

With this proposal, we would add:

* isSubsequenceOf (dual of subsequences)

Suggested implementation:
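
The proposal's own implementation is not reproduced in this digest. As a hedged sketch of what such a predicate could look like, the function below walks both lists once and consumes an element of the candidate only when it matches. The primed name `isSubsequenceOf'` is an assumption, chosen to avoid clashing with any existing export, and this is not necessarily the definition that was proposed.

```haskell
-- | True when the first list can be obtained from the second by deleting
-- zero or more elements without reordering the rest.
-- Illustrative sketch only; the name is hypothetical.
isSubsequenceOf' :: Eq a => [a] -> [a] -> Bool
isSubsequenceOf' [] _ = True          -- the empty list is always a subsequence
isSubsequenceOf' _ [] = False         -- nothing left to match against
isSubsequenceOf' xs@(x : xs') (y : ys)
  | x == y = isSubsequenceOf' xs' ys  -- matched: advance both lists
  | otherwise = isSubsequenceOf' xs ys -- skip y and keep looking for x
```

For instance, `isSubsequenceOf' "ace" "abcde"` holds, while `isSubsequenceOf' "aec" "abcde"` does not, because the order of elements matters.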
Categories: Offsite Discussion

Getting the haddocks back (was: documentation buildfailing in hackage?)

haskell-cafe - Mon, 10/13/2014 - 4:25pm
On Sun, Oct 12, 2014 at 4:50 AM, Mateusz Kowalczyk <fuuzetsu< at >> wrote:

I agree! My understanding (which is at /least/ 2nd or 3rd hand) is that the doc builds were turned off intentionally because it was a security issue, and that they are unlikely to come back.

Now, assuming that is the case, how can we solve the "there is no documentation" issue? Some ideas to kick-start discussion:

- Provide an option to include haddocks in the sdist bundle, and extract them on Hackage for display.
- Add a 'cabal uploadHaddock'
- Run all haddock builds in a VM/docker container / etc.. that mitigates the security concerns.
- ???

--Rogan
Categories: Offsite Discussion

Neil Mitchell: Shake's Internal State

Planet Haskell - Mon, 10/13/2014 - 2:53pm

Summary: Shake is not like Make, it has different internal state, which leads to different behaviour. I also store the state in an optimised way.

Update: I'm keeping an up to date version of this post in the Shake repo, which includes a number of questions/answers at the bottom, and is likely to evolve over time to incorporate that information into the main text.

In order to understand the behaviour of Shake, it is useful to have a mental model of Shake's internal state. To be a little more concrete, let's talk about Files which are stored on disk, which have ModTime values associated with them, where modtime gives the ModTime of a FilePath (Shake is actually generalised over all those things). Let's also imagine we have the rule:

file *> \out -> do
    need [dependency]
    run

So file depends on dependency and rebuilds by executing the action run.

The Make Model

In Make there is no additional state, only the file-system. A file is considered dirty if it has a dependency such that:

modtime dependency > modtime file

As a consequence, run must update modtime file, or the file will remain dirty and rebuild in subsequent runs.
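
The Make rule can be sketched as a pure function over a toy file system. Everything here is an assumption made for illustration: timestamps are plain integers, the file system is a `Map`, and a missing target or dependency is treated as dirty.

```haskell
import qualified Data.Map as Map

-- Assumption: modification times are modelled as plain integers.
type ModTime = Int

-- A toy file system mapping each path to its modification time.
type FileSystem = Map.Map FilePath ModTime

-- | Make-style check: the target is dirty when any dependency is newer.
dirtyMake :: FileSystem -> FilePath -> [FilePath] -> Bool
dirtyMake fs file deps = case Map.lookup file fs of
  Nothing -> True -- assumption: a missing target always needs building
  Just t -> any newerThan deps
    where
      -- assumption: a missing dependency also counts as dirty
      newerThan d = maybe True (> t) (Map.lookup d fs)
```

The only state consulted is the file system itself, and the comparison is an ordering test on timestamps.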

The Shake Model

For Shake, the state is:

database :: File -> (ModTime, [(File, ModTime)])

Each File is associated with a pair containing the ModTime of that file, plus a list of each dependency and their modtimes, all from when the rule was last run. As part of executing the rule above, Shake records the association:

file -> (modtime file, [(dependency, modtime dependency)])

The file is considered dirty if any of the information is no longer current. In this example, if either modtime file changes, or modtime dependency changes.
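
A small runnable sketch of this model follows, under stated assumptions: `ModTime` is an integer, the disk is represented by a `modtime` function passed in explicitly, and the names `record` and `dirty` are illustrative rather than Shake's API.

```haskell
import qualified Data.Map as Map

type File = String
type ModTime = Int -- assumption: integers stand in for real timestamps

-- The recorded state: for each file, its ModTime and the ModTime of each
-- dependency, all as observed when the rule last ran.
type Database = Map.Map File (ModTime, [(File, ModTime)])

-- | Record the association for a file after its rule has run.
record :: (File -> ModTime) -> File -> [File] -> Database -> Database
record modtime file deps =
  Map.insert file (modtime file, [(d, modtime d) | d <- deps])

-- | A file is dirty if it was never recorded or any recorded ModTime is no
-- longer current. Only equality is used, never ordering.
dirty :: (File -> ModTime) -> Database -> File -> Bool
dirty modtime db file = case Map.lookup file db of
  Nothing -> True
  Just (m, deps) -> m /= modtime file || any (\(d, dm) -> dm /= modtime d) deps
```

Because only equality is tested, a dependency whose timestamp moves backwards also marks the file dirty, which a Make-style ordering test would miss.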

There are a few consequences of the Shake model:

  • There is no requirement for modtime file to change as a result of run. The file is dirty because something changed; after we run the rule and record new information, it becomes clean.
  • Since a file is not required to change its modtime, things that depend on file may not require rebuilding even if file rebuilds.
  • If you update an output file, Shake will rebuild that file, as the ModTime of a result is tracked.
  • Shake only ever performs equality tests on ModTime, never ordering, which means it generalises to other types of value and works even if your file-system sometimes has incorrect times.

These consequences allow two workflows that aren't pleasant in Make:

  • Generated files, where the generator changes often, but the output of the generator for a given input changes rarely. In Shake, you can rerun the generator regularly, and using a function that writes only on change (writeFileChanged in Shake) you don't rebuild further. This technique can reduce some rebuilds from hours to seconds.
  • Configuration file splitting, where you have a configuration file with lots of key/value pairs, and want certain rules to only depend on a subset of the keys. In Shake, you can generate a file for each key/value and depend only on that key. If the configuration file updates, but only a subset of keys change, then only a subset of rules will rebuild. Alternatively, using Development.Shake.Config you can avoid the file for each key, but the dependency model is the same.
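
The "write only on change" idea from the first workflow can be sketched as follows. `writeIfChanged` is a hypothetical stand-alone helper written for illustration; it is not Shake's writeFileChanged and makes no claim about how that function is implemented.

```haskell
import qualified Data.ByteString as BS
import System.Directory (doesFileExist)

-- | Write the new contents only when the file is missing or differs.
-- Returns True when a write actually happened. Hypothetical helper.
writeIfChanged :: FilePath -> BS.ByteString -> IO Bool
writeIfChanged path new = do
  exists <- doesFileExist path
  -- Strict ByteString I/O reads the whole file and closes the handle
  -- before we possibly write to the same path.
  unchanged <- if exists then (== new) <$> BS.readFile path else return False
  if unchanged
    then return False -- leave the file, and hence its ModTime, untouched
    else BS.writeFile path new >> return True
```

When the generator produces identical output, the file is not touched, so its ModTime stays the same and nothing that depends on it is considered dirty.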

Optimising the Model

The above model expresses the semantics of Shake, but the implementation uses an optimised model. Note that the original Shake paper gives the optimised model, not the easy to understand model - that's because I only figured out the difference a few days ago (thanks to Simon Marlow, Simon Peyton Jones and Andrey Mokhov). To recap, we started with:

database :: File -> (ModTime, [(File, ModTime)])

We said that File is dirty if any of the ModTime values change. That's true, but what we are really doing is comparing the first ModTime with the ModTime on disk, and the list of second ModTime values with those in database. Assuming we are passed the current ModTime on disk, then a file is valid if:

valid :: File -> ModTime -> Bool
valid file mNow =
    mNow == mOld &&
    and [fst (database d) == m | (d,m) <- deps]
    where (mOld, deps) = database file

The problem with this model is that we store each File/ModTime pair once for the file itself, plus once for every dependency. That's a fairly large amount of information, and in Shake both File and ModTime can be arbitrarily large for user rules.

Let's introduce two assumptions:

Assumption 1: A File only has at most one ModTime per Shake run, since a file will only rebuild at most once per run. We use Step for the number of times Shake has run on this project.

Consequence 1: The ModTime for a file and the ModTime for its dependencies are all recorded in the same run, so they share the same Step.

Assumption 2: We assume that if the ModTime of a File changes, and then changes back to a previous value, we can still treat that as dirty. In the specific case of ModTime that would require time travel, but even for other values it is very rare.

Consequence 2: We only use historical ModTime values to compare them for equality with current ModTime values. We can instead record the Step at which the ModTime last changed, assuming all older Step values are unequal.

The result is:

database :: File -> (ModTime, Step, Step, [File])

valid :: File -> ModTime -> Bool
valid file mNow =
    mNow == mOld &&
    and [sBuilt >= changed (database d) | d <- deps]
    where (mOld, sBuilt, sChanged, deps) = database file
          changed (_, _, sChanged, _) = sChanged

For each File we store its most recently recorded ModTime, the Step at which it was built, the Step when the ModTime last changed, and the list of dependencies. We now check that the Step at which this file was built is greater than or equal to the Step at which each dependency last changed. Using the assumptions above, the original formulation is equivalent.

Note that instead of storing one ModTime for the file plus one per dependency, we now store exactly one ModTime plus two small Step values.
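
To make the Step bookkeeping concrete, here is a runnable toy version of the optimised model. It assumes a `Map`-backed database and an illustrative `recordBuild` helper, neither of which is Shake's actual code, and it treats an unrecorded dependency as having just changed.

```haskell
import qualified Data.Map as Map

type File = String
type ModTime = Int
type Step = Int

-- (latest ModTime, Step when built, Step when the ModTime last changed, deps)
type Database = Map.Map File (ModTime, Step, Step, [File])

-- | Record a build at the given Step. The "changed" Step only advances
-- when the ModTime differs from the previously recorded one.
recordBuild :: Step -> File -> ModTime -> [File] -> Database -> Database
recordBuild step file mNew deps db = Map.insert file (mNew, step, sChanged, deps) db
  where
    sChanged = case Map.lookup file db of
      Just (mOld, _, sOld, _) | mOld == mNew -> sOld
      _ -> step

-- | A file is valid if its ModTime is current and no dependency has
-- changed since the file was built.
valid :: Database -> File -> ModTime -> Bool
valid db file mNow = case Map.lookup file db of
  Nothing -> False
  Just (mOld, sBuilt, _, deps) ->
    mNow == mOld && all (\d -> sBuilt >= changedAt d) deps
  where
    changedAt d = case Map.lookup d db of
      Just (_, _, sChanged, _) -> sChanged
      Nothing -> maxBound -- assumption: unknown dependency counts as changed
```

If a dependency is rebuilt at a later Step but ends up with the same ModTime, its changed Step stays put and the files depending on it remain valid.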

We still store each file many times, but we reduce that by creating a bijection between File (arbitrarily large) and Id (small index) and only storing Id.
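
One way to picture such a bijection is an interning table that hands out small integer identifiers. The `Intern` type and its functions below are hypothetical names for illustration; the source only says that the real mapping "lives elsewhere" in Shake.

```haskell
import qualified Data.Map as Map

type Id = Int

-- | A two-way table between (possibly large) keys and small identifiers.
data Intern k = Intern
  { toId :: Map.Map k Id
  , fromId :: Map.Map Id k
  }

emptyIntern :: Intern k
emptyIntern = Intern Map.empty Map.empty

-- | Look up the Id for a key, allocating the next free Id if it is new.
intern :: Ord k => k -> Intern k -> (Id, Intern k)
intern k tbl = case Map.lookup k (toId tbl) of
  Just i -> (i, tbl)
  Nothing ->
    let i = Map.size (toId tbl) -- Ids are handed out as 0, 1, 2, ...
    in (i, Intern (Map.insert k i (toId tbl)) (Map.insert i k (fromId tbl)))
```

Dependency lists can then hold `Id` values, and each large key is stored once in the table.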

Implementing the Model

For those who like concrete details, which might change at any point in the future, the relevant definition is in Development.Shake.Database:

data Result = Result
    {result :: Value -- the result when last built
    ,built :: Step -- when it was actually run
    ,changed :: Step -- when the result last changed
    ,depends :: [[Id]] -- dependencies
    ,execution :: Float -- duration of last run
    ,traces :: [Trace] -- a trace of the expensive operations
    } deriving Show

The differences from the model are:

  • ModTime became Value, because Shake deals with lots of types of rules.
  • The dependencies are stored as a list of lists, so we still have access to the parallelism provided by need, and if we start rebuilding some dependencies we can do so in parallel.
  • We store execution and traces so we can produce profiling reports.
  • I haven't shown the File/Id mapping here - that lives elsewhere.
  • I removed all strictness/UNPACK annotations from the definition above, and edited a few comments.

As we can see, the code follows the optimised model quite closely.

Categories: Offsite Blogs

I'll give a brief talk about Haskell for a group of students, but my code is too slow. How can I improve it, without making it less elegant?

Haskell on Reddit - Mon, 10/13/2014 - 2:40pm

Hello, there is a class assignment at my school where teams must implement a graph library in a language of choice. Most of them are using C/C++/Java, so I find it a great opportunity to talk about Haskell. After all, seeing a problem you struggled to solve in 50 lines of C solved in a single line of Haskell certainly makes an impact. The idea is to define a single graphAlgorithm function, from which every other function in the assignment will be specialized as a one-liner. Unfortunately, the performance of such an approach will probably make Haskell look bad: calling a BFS takes ~14 seconds, while the same function, as I implemented it in JavaScript, takes ~0.5 seconds. That is a 28x slowdown. Of course, the Haskell code is ridiculously high level and clean, whereas my JS implementation shuffled bits for performance. But the audience doesn't care about that. So, is there any way I can improve the performance of this code without altering it a lot?

module Graph (adjacencyListGraph, breadthFirstSearch) where

import qualified Data.List as List
import qualified Data.IntSet as Set
import qualified Data.PriorityQueue.FingerTree as Queue
import Data.Array

type Node = Int
type Weight = Int
type Edge = (Node,Weight)
type Graph = Node -> [Edge]

data Queue element = Queue {
    insert :: element -> Queue element,
    extract :: (element, Queue element),
    empty :: Bool }

stack :: Queue a
stack = listToStack [] where
    listToStack container = Queue {
        insert = \node -> listToStack (node : container),
        extract = (head container, listToStack (tail container)),
        empty = List.null container }

adjacencyListGraph :: Array Node [Edge] -> Graph
adjacencyListGraph edgesArray node = edgesArray ! node

neighbors :: Node -> Graph -> [Node]
neighbors node graph = map fst (graph node)

graphAlgorithm :: Queue Node -> Graph -> Node -> [Node]
graphAlgorithm queue graph node = walk (insert queue node) Set.empty [] where
    walk queue visited result
        | empty queue = result
        | Set.member node visited = walk queueWithoutNode visited result
        | otherwise = walk queueWithNeighbors (Set.insert node visited) (node : result)
        where
            (node,queueWithoutNode) = extract queue
            queueWithNeighbors = List.foldl' insert queue (neighbors node graph)

breadthFirstSearch :: Graph -> Node -> [Node]
breadthFirstSearch = graphAlgorithm stack

Code on GitHub - run test.hs.

Profile info.

submitted by SrPeixinho
Categories: Incoming News

:sprint behaves differently in ghc 7.8.3 ?

Haskell on Reddit - Mon, 10/13/2014 - 1:30pm

In ghci, :sprint does not seem to work anymore:

@arch-docker ~ > ghci
GHCi, version 7.8.3: :? for help
Prelude> let x = 1 + 2
Prelude> :sprint x
x = _
Prelude> x
3
Prelude> :sprint x
x = _

I have tried to google about this but could not find any pointer.

By the way, what would be the most appropriate place for this kind of question ?

submitted by pi3r
Categories: Incoming News

Austin Seipp: The New

Planet Haskell - Mon, 10/13/2014 - 12:10pm

Hello there!

What you're reading is a blog post. Where is it from? It's from What's it doing? It's cataloging the thoughts of the people who run

That's right. This is our new adventure in communicating with you. We wanted some place to put more long-form posts, invite guests, and generally keep people up to date about general improvements (and faults) to our infrastructure, now and in the future. Twitter and a short-form status site aren't so great, and a non-collective mind of people posting scattered things on various sites/lists isn't super helpful for cataloging things.

So for an introduction post, we've really got a lot to talk about...

A quick recap on recent events

has had some rough times lately.

About a month and a half ago, we had an extended period of outage, roughly around the weekend of ICFP 2014. This was due to a really large amount of I/O getting backed up on our host machine, rock. rock is a single-tenant, bare-metal machine from Hetzner that we used to host several VMs that comprise the old server set; including the main website, the GHC Trac and git repositories, and Hackage. We alleviated a lot of the load by turning off the hackage server, and migrating one of the VMs to a new hosting provider.

Then, about a week and a half ago, we had another hackage outage that was a result of more meager concerns: disk space usage. Much to my chagrin, this was due in part to an absence of log rotation over the past year, which resulted in a hefty 15GB of text sitting around (in a single file, no less). Oops.

This caused a small bump in the road, which was that the hackage server had a slight error while committing some transactions in the database when it ran out of disk. We recovered from this (thanks to @duncan for the analysis), and restarted it. (We also had point-in-time backups, but in this case it was easier to fix than to roll back the whole database.)

But we've had several other availability issues beforehand too, including faulty RAM and inconsistent performance. So we're setting out to fix it. And in the process we figured, hey, they'd probably like to hear us babble about a lot of other stuff, too, because why not?

New things

OK, so enough sad news about what happened. Now you're wondering what's going to happen. Most of these happening-things will be good, I hope.

There are a bunch of new things we've done over the past year or so for, so it's best to summarize them a bit. These aren't in any particular order; most of the things written here are pretty new and some are a bit older since the servers have started churning a bit. But I imagine many things will be new to y'all.

A new blog, right here.

And it's a fancy one at that (powered by Phabricator). Like I said, we'll be posting news updates here that we think are applicable for the community at large - but most of the content will focus on the administrative side.

A new hosting provider: Rackspace

As I mentioned earlier this year pending the GHC 7.8 release, Rackspace has graciously donated resources towards for GHC, particularly for buildbots. We had at that time begun using Rackspace resources for hosting resources. Over the past year, we've done so more and more, to the point where we've decided to move all of It became clear we could offer a lot higher reliability and greatly improved services for users, using these resources.

Jesse Noller was my contact point at Rackspace, and has set up for its 2nd year running with free Rackspace-powered machines, storage, and services. That's right: free (to a point, the USD value of which I won't disclose here). With this, we can provide more redundant services both technically and geographically, we can offer better performance, better features and management, etc. And we have their awesome Fanatical Support.

So far, things have been going pretty well. We've migrated several machines to Rackspace, including:

We're still moving more servers, including:

Many thanks to Rackspace. We owe them greatly.

Technology upgrades, increased security, etc etc

We've done several overhauls of the way is managed, including security, our underlying service organization, and more.

  • A better webserver: All of our web instances are now served with nginx where we used Apache before. A large motivation for this was administrative headache, since most of us are much more familiar with nginx as opposed to our old Apache setup. On top of that we get increased speed and a more flexible configuration language (IMO). It also means we have to now run separate proxy servers for nginx, but systems like php-fpm or gunicorn tend to have much better performance and flexibility than things like mod_php anyway.
  • Ubuntu LTS: Almost all of our new servers are running Ubuntu 14.04 LTS. Previously we were running Debian stable, and before Debian announced their LTS project for Squeeze, the biggest motivation was Ubuntu LTSes typically had a much longer lifespan.
  • IPv6 all the way: All of our new servers have IPv6, natively.
  • HTTPS: We've rolled out HTTPS for the large majority of Our servers sport TLS 1.2, ECDHE key exchange, and SPDY v3 with strong cipher suites. We've also enabled HSTS on several of our services (including Phabricator), and will continue enabling it for likely every site we have.
  • Reorganized: We've done a massive reorganization of the server architecture, and we've generally split up services to be more modular, with servers separated in both geographic locations and responsibilities where possible.
  • Consolidation: We've consolidated several of our services too. The biggest change is that we now have a single, consolidated MariaDB 10.0 server powering our database infrastructure. All communications to this server are encrypted with spiped for high security. Phabricator, the wiki, some other minor things (like a blog), and likely future applications will use it for storage where possible too.
  • Improved hardware: Every server now has dedicated network, and servers that are linked together (like buildbots, or databases) are privately networked. All networking operations are secured with spiped where possible.

Interlude: A new Hackage server

While we're on the subject, here's an example of what the new Hackage Server will be sporting:

Old server:

  • 8GB RAM, probably 60%+ of all RAM taken by disk cache.
  • Combined with the hackage-builder process.
  • 1 core.
  • Shared ethernet link amongst multiple VMs (no dedicated QOS per VM, AFAIK). No IPv6.
  • 1x100GB virtual KVM block device backed by RAID1 2x2TB SATA setup on the host.

New server:

  • 4x cores.
  • 4GB RAM, as this should fit comfortably with nginx as a forward proxy.
  • Hackage builder has its own server (removing much of the RAM needs).
  • Dedicated 800Mb/s uplink, IPv6 enabled.
  • Dedicated dual 500GB block devices (backed by dedicated RAID10 shared storage) in RAID1 configuration.

So, Hackage should hopefully be OK for a long time. And, the doc builder is now working again, and should hopefully stay that way too.

Automation: it's a thing

Like many other sites, is big, complicated, intimidating, and there are occasionally points where you find a Grue, and it eats you mercilessly.

As a result, automation is an important part of our setup, since it means if one of us is hit by a bus, people can conceivably still understand, maintain and continue to improve in the future. We don't want knowledge of the servers locked up in anyone's head.

In The Past, Long ago in a Galaxy unsurprisingly similar to this one at this very moment, did not really have any automation. At all, not even to create users. Some of still does not have automation. And even still, in fact, some parts of it are still a mystery to all, waiting to be discovered. That's obviously not a good thing.

Today, has two projects dedicated to automation purposes. These are:

  • Ansible, available in rA, which is a set of Ansible playbooks for automating various aspects of the existing servers.
  • Auron, available in rAUR, is a new, Next Gen™ automation framework, based on NixOS.

We eventually hope to phase out Ansible in favor of Auron. While Auron is still very preliminary, several services have been ported over, and the setup does work on existing providers. Auron also is much more philosophically aligned with our desires for automation, including reproducibility, binary determinism, security features, and more.

More code in the open

In our quest to automate our tiny part of the internet, we've begun naturally writing a bit of code. What's the best thing to do with code? Open source it!

The new haskell-infra organization on GitHub hosts our code, including:

Most of our repositories are hosted on GitHub, and we use our Phabricator for code review and changes between ourselves. (We still accept GitHub pull requests though!) So it's pretty easy to contribute in whatever way you want.

Better DNS and content delivery: CloudFlare & Fastly

We've very recently begun using CloudFlare for DNS management, DDoS mitigation, and analytics. After a bit of deliberation, we decided that after moving off Hetzner we'd think about a 3rd party provider, as opposed to running our own servers.

We chose CloudFlare mostly because aside from a nice DNS management interface, and great features like AnyCast, we also get analytics and security features, including immediate SSL delivery. And, of course, we get a nice CDN on top for all HTTP content. The primary benefits from CloudFlare are the security and caching features (in that order, IMO). The DNS interface is still particularly useful however; the nameservers should be redundant, and CloudFlare acts more like a reverse proxy as changes are quick and instant.

But unfortunately while CloudFlare is great, it's only a web content proxy. That means certain endpoints which need things like SSH access can not (yet) be reliably proxied, which is one of the major downfalls. As a result, not all of will be magically DDoS/spam resistant, but a much bigger amount of it will be. But the bigger problem is: we have a lot of non-web content!

In particular, none of our Hackage server downloads, for example, can be proxied: Hackage, like most package repositories, merely uses HTTP as a transport layer for packages. In theory you could use a binary protocol, but HTTP has a number of advantages (like firewalls being nice to it). Using a service like CloudFlare for such content is - at the least - a complete violation of the spirit of their service, and just a step beyond that a total violation of their ToS (Section 10). But hackage pushes a few TB a month in traffic - so we have to pay line-rate for that, by the bits. And also, Hackage can't have data usefully mirrored to CDN edges - all traffic has to hop through to the Rackspace DCs, meaning users suffer at the hands of latency and slower downloads.

But that's where Fastly came to the rescue. Fastly also recently stepped up to provide with an Open Source Software discount - meaning we get their awesome CDN for free, for custom services! Hooray!

Since Fastly is a dedicated CDN service, you can realistically proxy whatever you want with it, including our package downloads. With the help of a new friend of ours (@davean), we'll be moving Fastly in front of Hackage soon. Hopefully this just means your downloads and responsiveness will get faster, and we'll use less bandwidth. Everyone wins.

Finally, we're rolling out CloudFlare gradually to new servers to test them and make sure they're ready. In particular, we hope to not disturb any automation as a result of the switch (particularly to new SSL certificates), and also, we want to make sure we don't unfairly impact other people, such as Tor users (Tor/CloudFlare have a contentious relationship - lots of nasty traffic comes from Tor endpoints, but so does a ton of legitimate traffic). Let us know if anything goes wrong.

Better server monitoring: DataDog & Nagios

Server monitoring is a crucial part of managing a set of servers, and unfortunately was quite bad at it before. But not anymore! We've done a lot to try and increase things. Before my time, as far as I know, we pretty much only had some lame mrtg graphs of server metrics. But we really needed something more than that, because it's impossible to have modern infrastructure on that alone.

Enter DataDog. I played with their product last year, and I casually approached them and asked if they would provide an account for - and they did!

DD provides real-time analytics for servers, while providing a lot of custom integrations with services like MySQL, nginx, etc. We can monitor load, networking, and correlate this with things like database or webserver connection count. Events occur from all over On top of that, DD serves as a real-time dashboard for us to organize and comment on events as they happen.

But metrics aren't all we need. There are two real things we need: metrics (point-in-time data), and resource monitoring (logging, daemon watchdogs, resource checks, etc etc).

This is where Nagios comes in - we have it running and monitoring all our servers for daemons, health checks, endpoint checks for connectivity, and more. Datadog helpfully plugs into Nagios, and reports events (including errors), as well as sending us weekly summaries of Nagios reports. This means we can use the Datadog dashboard as a consolidated piece of infrastructure for metrics and events.

As a result: is being monitored much more closely here on out we hope.

Better log analysis: ElasticSearch

We've (very recently) also begun rolling out another part of the equation: log management. Log management is essential to tracking down big issues over time, and in the past several years, ElasticSearch has become incredibly popular. We have a new ElasticSearch instance, running along with Logstash, which several of our servers now report to (via the logstash-forwarder service, which is lightweight even on smaller servers). Kibana sits in front on a separate server for query management so we can watch the systems live.

Furthermore, our ElasticSearch deployment is, like the rest of our infrastructure, 100% encrypted - Kibana proxies backend ElasticSearch queries through HTTPS and over spiped. Servers dump messages into LogStash over SSL. I would have liked to use spiped for the LogStash connection as well, but SSL is unfortunately mandatory at this time (perhaps for the best).

We're slowly rolling out logstash-forwarder over our new machines, and tweaking our LogStash filters so they can get juicy information. Hopefully our log index will become a core tool in the future.

A new status site

As I'm sure some of you might be aware, we now have a fancy new site,, that we'll be using to post updates about the infrastructure, maintenance windows, and expected (or unexpected!) downtimes. And again, someone came to help us - gave us this for free!

Better server backups

Rackspace also fully supports their backup agents which provide compressed, deduplicated backups for your servers. Our previous situation on Hetzner was a lot more limited in terms of storage and cost. Our backups are stored privately on Cloud Files - the same infrastructure that hosts our static content.

Of course, backup on Rackspace is only one level of redundancy. That's why we're thinking about trying to roll out Tarsnap soon too. But either way, our setup is far more reliable and robust and a lot of us are sleeping easier (our previous backups were space hungry and becoming difficult to maintain by hand.)

GHC: Better build bots, better code review

GHC has for a long time had an open infrastructure request: the ability to build patches users submit, and even patches we write, in order to ensure they do not cause machines to regress. Developers don't necessarily have access to every platform (cross compilers, Windows, some obscurely old Linux machine), so having infrastructure here is crucial.

We also needed more stringent code review. I (Austin) review most of the patches, but ideally we want more people reviewing lots of patches, submitting patches, and testing patches. And we really need ways to test all that - I can't be the bottleneck to test a foreign patch on every machine.

At the same time, we've also had a nightly build infrastructure, but our build infrastructure has often hobbled along with custom code running it (bad for maintenance), and the bots are not directed and off to the side - so it's easy to miss build reports from them.

Enter Harbormaster, our Phabricator-powered buildbot for continuous integration and patch submissions!

Harbormaster is a part of Phabricator, and it runs builds on all incoming patches and commits to GHC. How?

  • First, when a patch or commit for GHC comes in, this triggers an event through a Herald rule. Herald is a Phab application to get notified or perform actions when events arrive. When a GHC commit or patch comes in, a rule is triggered, which begins a build.
  • Our Herald rule runs a build plan, which is a dependency based sequence of actions to run.
  • The first thing our plan does is allocate a machine resource, or a buildbot. It does this by taking a lease on the resource to acquire (non-exclusive) access to it, and it moves forward. Machine management is done by a separate application, Drydock.
  • After leasing a machine, we SSH into it.
  • We then run a build, using our phab-ghc-builder code.
  • Harbormaster tracks all the stdout output, and test results.
  • It then reports back on the Code review, or the commit in question, and emails the author.
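The flow above can be sketched as a toy model. This is only an illustration under stated assumptions: the names Step, Report, buildPlan, runPlan and summary are hypothetical and are not part of Harbormaster, Drydock or phab-ghc-builder, and a real step would do I/O rather than consult a pure predicate.

```haskell
-- Toy model of the build flow described above. All names here are
-- hypothetical; this is not Harbormaster's or Drydock's actual API.

-- | The actions a build plan performs, in order.
data Step = LeaseMachine | ConnectSSH | RunBuilder
  deriving (Show, Eq)

-- | What is reported back to the code review or commit.
data Report = Report
  { completed :: [Step]      -- steps that succeeded, in order
  , failedAt  :: Maybe Step  -- first failing step, if any
  } deriving (Show, Eq)

-- | A plan is a sequence in which each step depends on the previous one.
buildPlan :: [Step]
buildPlan = [LeaseMachine, ConnectSSH, RunBuilder]

-- | Run the steps in order and stop at the first failure. The first
-- argument stands in for actually performing a step.
runPlan :: (Step -> Bool) -> [Step] -> Report
runPlan succeeds = go []
  where
    go done [] = Report (reverse done) Nothing
    go done (s : rest)
      | succeeds s = go (s : done) rest
      | otherwise  = Report (reverse done) (Just s)

-- | One-line summary, as might be sent to the patch author.
summary :: Report -> String
summary r = case failedAt r of
  Nothing -> "build passed after " ++ show (length (completed r)) ++ " steps"
  Just s  -> "build failed at " ++ show s
```

For example, summary (runPlan (const True) buildPlan) gives "build passed after 3 steps", while summary (runPlan (/= ConnectSSH) buildPlan) gives "build failed at ConnectSSH".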

This has already led to a rather large change in development for most GHC developers, and Phabricator is building our patches regularly now - yes, even committers use it!

Harbormaster will get more powerful in the future: our build plans will lease more resources, including Windows, Mac, and different varieties of Linux machines, and it will run more general build plans for cross compilers and other things. It's solved a real problem for us, and the latest infrastructure has been relatively reliable. In fact I just get lazy and submit diffs to GHC without testing them - I let the machines do it. Viva la code review!

(See the GHC wiki for more on our Phabricator process; there's a lot written there for GHC developers.)

Phabricator: Documentation, an official wiki, and a better issue tracker

That's right, there's now documentation about the infrastructure, hosted on our new official wiki. And now you can report bugs through Maniphest to us. Both of these applications are powered by Phabricator, just like our blog.

In a previous life, we used Request Tracker (RT) to do support management. Our old RT instance is still running, but it's filled with garbage old tickets and some spam, it has a PostgreSQL instance all to itself (quite wasteful), and it generally has not seen active use in years. We've decided to phase it out soon, and instead use our Phabricator instance to manage problems, tickets, and discussions. We've already started importing and rewriting new content into our wiki and modernizing things.

Hopefully these docs will help keep people up to date about the happenings here.

But also, our Phabricator installation has become an umbrella installation for several projects (even the Committee may try to use it for book-keeping). In addition, we've been taking the time to extend and contribute to Phab where possible to improve the experience for users.

We've also authored several Phab extensions:

  • libphutil-haskell in rPHUH, which extends Phabricator with custom support for GHC and other things.
  • libphutil-rackspace in rPHUR, which extends Phabricator with support for Rackspace, including Cloud Files for storage needs, and build-machine allocation for Harbormaster.
  • libphutil-scrypt (deprecated; soon to be upstream) in rPHUSC, which extends Phabricator with password hashing support for the scrypt algorithm.

Future work

Of course, we're not done. That would be silly. Maintaining and providing better services to the community is a real necessity for anything to work at all (and remember: computers are the worst).

We've got a lot further to go. Some sneak peeks...

  • We'll probably attempt to roll out HHVM for our Mediawiki instance to improve performance and reduce load.
  • We'll be creating more GHC buildbots, including a fabled Windows build bot, and on-demand servers for distributed build load.
  • We'll be looking at ways of making it easier to donate (on the homepage, with a nice embedded donation link).
  • Moar security. I (Austin) in particular am looking into deploying a setup like grsecurity for new servers to harden them automatically.
  • We'll roll out a new server that will serve as a powerful, scalable file hosting solution for things like the Haskell Platform or GHC. This will hopefully alleviate administration overhead, reduce bandwidth, and make things quicker (thanks again, Fastly!)

And, of course, we'd appreciate all the help we can get!

el fin

This post was long. This is the ending. You probably won't read it. But we're done now! And I think that's all the time we have for today.

Categories: Offsite Blogs

Kevin Reid (kpreid): Game idea: “Be Consistent”

Planet Haskell - Mon, 10/13/2014 - 11:46am

Here’s another idea for a video game.

The theme of the game is “be consistent”. It's a minimalist-styled 2D platformer. The core mechanic is that whatever you do the first time, the game makes it so that that was the right action. Examples of how this could work:

  • At the start, you're standing at the center of a 2×2 checkerboard of background colors (plus appropriate greebles, not perfect squares). Say the top left and bottom right is darkish and the other quadrants are lightish. If you move left, then the darkish stuff is sky, the lightish stuff is ground, and the level extends to the left. If you move right, the darkish stuff is ground, and the level extends to the right.

  • The first time you need to jump, if you press W or up then that's the jump key, or if you press the space bar then that's the jump key. The other key does something else. (This might interact poorly with an initial “push all the keys to see what they do”, though.)

  • You meet a floaty pointy thing. If you walk into it, it turns out to be a pickup. If you shoot it or jump on it, it turns out to be an enemy.

  • If you jump in the little pool of water, the game has underwater sections or secrets. If you jump over the little pool, water is deadly.
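As a rough illustration of the core mechanic, the jump-key example could be modelled as "bind on first use". This is only a sketch: the types Key, Action and Bindings and the function press are hypothetical names made up for this example, and W and up are treated as a single key group.

```haskell
-- Sketch of the "first choice becomes the rule" mechanic for the jump
-- key. All names are hypothetical, invented for this illustration.

-- | The two candidate inputs; W and the up arrow are treated as one.
data Key = UpOrW | Space
  deriving (Show, Eq)

-- | What a key press ends up doing.
data Action = Jump | OtherAction
  deriving (Show, Eq)

-- | The jump key is unknown until the player first presses something.
newtype Bindings = Bindings { jumpKey :: Maybe Key }
  deriving (Show, Eq)

-- | Handle a key press at a moment when the player needs to jump.
-- The first press fixes the jump key; afterwards the other key
-- does something else.
press :: Key -> Bindings -> (Action, Bindings)
press k b = case jumpKey b of
  Nothing -> (Jump, Bindings (Just k))
  Just j
    | j == k    -> (Jump, b)
    | otherwise -> (OtherAction, b)
```

Starting from Bindings Nothing, pressing Space first makes Space the jump key, so a later press of UpOrW yields OtherAction. The same pattern (an unset choice that the first action fills in) would apply to the other examples in the list.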

Categories: Offsite Blogs

The New

Haskell on Reddit - Mon, 10/13/2014 - 11:31am
Categories: Incoming News