News aggregator

Mark Jason Dominus: Want to work with me on one of these projects?

Planet Haskell - Wed, 05/13/2015 - 11:53am

I did a residency at the Recurse Center last month. I made a profile page on their web site, which asked me to list some projects I was interested in working on while there. Nobody took me up on any of the projects, but I'm still interested. So if you think any of these projects sounds interesting, drop me a note and maybe we can get something together.

They are listed roughly in order of their nearness to completion, with the most developed ideas first and the vaporware at the bottom. I am generally language-agnostic, except I refuse to work in C++.

Or if you don't want to work with me, feel free to swipe any of these ideas yourself. Share and enjoy.

Linogram

Linogram is a constraint-based diagram-drawing language that I think will be better than prior languages (like pic, Metapost, or, god forbid, raw postscript or SVG) and very different from WYSIWYG drawing programs like Inkscape or Omnigraffle. I described it in detail in chapter 9 of Higher-Order Perl and it's missing only one or two important features that I can't quite figure out how to do. It also needs an SVG output module, which I think should be pretty simple.

Most of the code for this already exists, in Perl.

I have discussed Linogram previously in this blog.

Orthogonal polygons  

Each angle of an orthogonal polygon is either 90° or 270°. All 4-sided orthogonal polygons are rectangles. All 6-sided orthogonal polygons are similar-looking letter Ls. There are essentially only four different kinds of 8-sided orthogonal polygons. There are 8 kinds of 10-sided orthogonal polygons:

There are 29 kinds of 12-sided orthogonal polygons. I want to efficiently count the number of orthogonal polygons with N sides, and have the computer draw exemplars of each type.

I have a nice method for systematically generating descriptions of all simple orthogonal polygons, and although it doesn't scale to polygons with many sides I think I have an idea to fix that, making use of group-theoretic (mathematical) techniques. (These would not be hard for anyone to learn quickly; my ten-year-old daughter picked them right up. Teaching the computer would be somewhat trickier.) For making the pictures, I only have half the ideas I need, and I haven't done the programming yet.

The little code I have is written in Perl, but it would be no trouble to switch to a different language.

[ Addendum 20150607: the orthogonal polygon sequence is now in OEIS! ]

Simple Android app

I want to learn to build Android apps for my Android phone. I think a good first project would be a utility where you put in a sequence of letters, say FBS, and it displays all the words that contain those letters in order. (For FBS the list contains "afterburners", "chlorofluorocarbons", "fables", "fabricates", …, "surfboards".) I play this game often with my kid (the letters are supplied by license plates we pass) and we want a way to cheat when we are stumped.

My biggest problem with Android development in the past has been getting the immense Android SDK set up.

The project would need to be done in Java, because that is what Android uses.

gi

Git is great, but its user interface is awful. The command set is obscure and non-orthogonal. Error messages are confusing. gi is a thinnish layer that tries to present a more intuitive and uniform command set, with better error messages and clearer advice, without removing any of git's power.

There's no code written yet, and we could do it in any language. Perl or Python would be good choices. The programming is probably easy; the hard part of this project is (a) design and (b) user testing.

I have a bunch of design notes written up about this already.

Twingler

Twingler takes an example of an input data structure and and output data structure, and writes code in your favorite language for transforming the input into the output. Or maybe it takes some sort of simplified description of what is wanted and writes the code from that. The description would be declarative, not procedural. I'm really not at all sure what it should do or how it should work, but I have a lot of notes, and if we could make it happen a lot of people would love it.

No code is written; we could do this in your favorite language. Haskell maybe?

Bonus: Whatever your favorite language is, I bet it needs something like this.

Crapspad

I want a simple library that can render simple pixel graphics and detect and respond to mouse events. I want people to be able to learn to use it in ten minutes. It should be as easy as programming graphics on an Apple II and easier than a Commodore 64. It should not be a gigantic object-oriented windowing system with widgets and all that stuff. It should be possible to whip up a simple doodling program in Crapspad in 15 minutes.

I hope to get Perl bindings for this, because I want to use it from Perl programs, but we could design it to have a language-independent interface without too much trouble.

Git GUI

There are about 17 GUIs for Git and they all suck in exactly the same way: they essentially provide a menu for running all the same Git commands that you would run at the command line, obscuring what is going on without actually making Git any easier to use. Let's fix this.

For example, why can't you click on a branch and drag it elsewhere to rebase it, or shift-drag it to create a new branch and rebase that? Why can't you drag diff hunks from one commit to another?

I'm not saying this stuff would be easy, but it should be possible. Although I'm not convinced I really want to put ion the amount of effort that would be required. Maybe we could just submit new features to someone else's already-written Git GUI? Or if they don't like our features, fork their project?

I have no code yet, and I don't even know what would be good to use.

Categories: Offsite Blogs

Functional Jobs: Senior Software Engineer (Functional Programming/Scala) at AdAgility (Full-time)

Planet Haskell - Wed, 05/13/2015 - 10:14am

Who we are

AdAgilityTM cross-promotes thousands of offers everyday, driving incremental revenue and margin to clients while delighting our customers. The AdAgility Platform supports both first and third-party offer delivery, to help monetization experts, ecommerce managers, and partnership teams power cross-sell offers with minimal technical effort. A single line of code allows for the secure and efficient delivery of relevant offers, with 24/7 access to full-funnel analytics and powerful real-time offer administration.

As a key member of our engineering team, you will be at the forefront of building out the decisioning and analytics capabilities that deliver fantastic results for our clients.

Location

Our main street office is a 5 minute walk from the Waltham commuter rail station. There is plenty of (free) parking onsite and it is walking distance to a bevy of restaurants.

This position requires presence in the office on a weekly basis.

The Stack

Our goal is to maximize the amount of time we spend building and releasing great software. The technology choices we have made so far reflect this view.

  • Our data processing and decisioning systems are written in Scala
  • The dashboard is a Ruby on Rails application running against a Postgres cluster
  • All of our deployments are automated with Ansible
  • We run on AWS services such as EC2, Kinesis, Elasticache, S3, ELB and Route 53
  • Libraries we are using include Scalaz, Spray, Akka and Kamon
  • We leverage light-weight, saas monitoring tools to know exactly what’s going on 24/7: StackDriver, Cronitor, Logentries, Rollbar, …

What you really need

  • Proficiency with two jvm languages (Scala, Clojure, Java, …)
  • Ability to work independently but wisely - if you are stuck on something, don’t wait a week to bounce ideas off a colleague
  • Experience in functional programming (Haskell, Erlang, Lisp, ML, Scala, etc) - open source or significant online coursework counts
  • Knowledge of the various forms of automated testing - this is not the wild west
  • Battle scars from building systems (this is not measured in years of experience)
  • A burning desire to build software that produces real value for your team

The cherry on top (nice to haves)

  • Experience with Ansible or similar deployment automation tools
  • You’ve dabbled with Angular, Node.js or Rails
  • A knack for front-end design
  • A good understanding of micro-services and when it makes sense to use them

What we value

  • Honesty
  • Dependability
  • A positive attitude
  • Having fun
  • Checking your ego at the door

Company-wide benefits

At AdAgility, we strive to create a fun, low stress environment that is conducive for success. Some of our top benefits include:

  • 100% company paid health + dental insurance
  • Customize your work station. Standing desk - no problem!
  • Flexible hours
  • A weekly, company-wide work-from-home day
  • Regular team outings (recent ones include F1 racing, a Bruins game and laser tag)
  • Feeling tired in the afternoon? Take a cat nap on our sectional.

Get information on how to apply for this position.

Categories: Offsite Blogs

Functional Jobs: Senior Systems Engineer (Scala) at AdAgility (Full-time)

Planet Haskell - Wed, 05/13/2015 - 10:14am

Who we are

AdAgilityTM cross-promotes thousands of offers everyday, driving incremental revenue and margin to clients while delighting our customers. The AdAgility Platform supports both first and third-party offer delivery, to help monetization experts, ecommerce managers, and partnership teams power cross-sell offers with minimal technical effort. A single line of code allows for the secure and efficient delivery of relevant offers, with 24/7 access to full-funnel analytics and powerful real-time offer administration.

As a key member of our engineering team, you will be at the forefront of building out the decisioning and analytics capabilities that deliver fantastic results for our clients.

Location

Our main street office is a 5 minute walk from the Waltham commuter rail station. There is plenty of (free) parking onsite and it is walking distance to a bevy of restaurants.

This position requires presence in the office on a weekly basis.

The Stack

Our goal is to maximize the amount of time we spend building and releasing great software. The technology choices we have made so far reflect this view.

  • Our data processing and decisioning systems are written in Scala
  • The dashboard is a Ruby on Rails application running against a Postgres cluster
  • All of our deployments are automated with Ansible
  • We run on AWS services such as EC2, Kinesis, Elasticache, S3, ELB and Route 53
  • Libraries we are using include Scalaz, Spray, Akka and Kamon
  • We leverage light-weight, saas monitoring tools to know exactly what’s going on 24/7: StackDriver, Cronitor, Logentries, Rollbar, …

What you really need

  • Proficiency with two jvm languages (Scala, Clojure, Java, …)
  • Ability to work independently but wisely - if you are stuck on something, don’t wait a week to bounce ideas off a colleague
  • Experience in functional programming (Haskell, Erlang, Lisp, ML, Scala, etc) - open source or significant online coursework counts
  • Knowledge of the various forms of automated testing - this is not the wild west
  • Battle scars from building systems (this is not measured in years of experience)
  • A burning desire to build software that produces real value for your team

The cherry on top (nice to haves)

  • Experience with Ansible or similar deployment automation tools
  • You’ve dabbled with Angular, Node.js or Rails
  • A knack for front-end design
  • A good understanding of micro-services and when it makes sense to use them

What we value

  • Honesty
  • Dependability
  • A positive attitude
  • Having fun
  • Checking your ego at the door

Company-wide benefits

At AdAgility, we strive to create a fun, low stress environment that is conducive for success. Some of our top benefits include:

  • 100% company paid health + dental insurance
  • Customize your work station. Standing desk - no problem!
  • Flexible hours
  • A weekly, company-wide work-from-home day
  • Regular team outings (recent ones include F1 racing, a Bruins game and laser tag)
  • Feeling tired in the afternoon? Take a cat nap on our sectional.

Get information on how to apply for this position.

Categories: Offsite Blogs

Bryan O'Sullivan: Sometimes, the old ways are the best

Planet Haskell - Wed, 05/13/2015 - 10:13am

Over the past few months, the Sigma engineering team at Facebook has rolled out a major Haskell project: a rewrite of Sigma, an important weapon in our armory for fighting spam and malware.

Sigma has a mission-critical job, and it needs to scale: its growing workload currently sees it handling tens of millions of requests per minute.

The rewrite of Sigma in Haskell, using the Haxl library that Simon Marlow developed, has been a success. Throughput is higher than under its predecessor, and CPU usage is lower. Sweet!

Nevertheless, success brings with it surprises, and even though I haven’t worked on Sigma or Haxl, I’ve been implicated in one such surprise. To understand my accidental bit part in the show, let's begin by mentioning that Sigma uses JSON internally for various purposes. These days, the Haskell-powered Sigma uses aeson, the JSON library I wrote, to handle JSON data.

A few months ago, the Haxl rewrite of Sigma was going through an episode of crazytown, in which it would intermittently and unpredictably use huge amounts of CPU and memory. The culprit turned out to be JSON strings containing zillions of backslashes. (I have no idea why. If you’ve worked with large volumes of data for a long time, you won’t even bat an eyelash at the idea that a data store somewhere contains some really weird records.)

The team quickly mitigated the problem, and gave me a nudge that I might want to look into the problem. On Sunday evening, with a glass of red wine in hand, I finally dove in to see what was wrong.

Since the Sigma developers had figured out what was causing these time and space explosions, I immediately had a test case to work with, and the results were grim: decoding a mere megabyte of continuous backslashes took over a second, consumed over a gigabyte of memory, and killed concurrency by causing the runtime system to spend almost 90% of its time in the garbage collector. Yikes!

Whatever was going on? If you look at the old implementation of aeson’s unescape function, it seems quite efficient and innocuous. It’s reasonably tightly optimized low-level Haskell.

Trouble is, unescape uses an API (a bytestring builder) that is intended for streaming a result incrementally. Unfortunately the unescape function can’t hand any data back to its caller until it has processed an entire string.

The result is as you’d expect: we build a huge chain of thunks. In this case, the thunks will eventually write data efficiently into buffers. Alas, the thunks have nobody demanding the evaluation of their contents. This chain consumes a lot (a lot!) of memory and incurs a huge amount of GC overhead (long chains of thunks are expensive). Sadness ensues.

The “old ways” in the title refer to the fix: in place of a fancy streaming API, I simply allocate a single big buffer and blast the bytes straight into it.

For that pathological string with almost a megabyte of consecutive backslashes, the new implementation is 27x faster and uses 42x less memory, all for the cost of perhaps an hour of Sunday evening hacking (including a little enabling work that incidentally illustrates just how easy it is to work with monad transformers). Not bad!

Categories: Offsite Blogs

Neural networks and fast automatic differentiation?

Haskell on Reddit - Wed, 05/13/2015 - 10:12am

I've been playing around with neural networks recently, and as a haskell enthusiast i was very happy to see the ad package.

Playing around with the thought of neural networks in haskell turned out quite nice:

Essentially you can describe a net as a function from a structure of parameters to something (usually a function). As long as it can be turned into some error in the end, the function can be trained.

This is just DownStar f a b, where f is the structure of parameters, a the type of the parameters, and b the output.

This works beautifully in theory, but is terribly slow and memory inefficient in practice. How would one go about improving performance, while still keeping some generality?

There's some not too well documented code below, where Data.Net.hs explains the general idea: https://github.com/Apsod/fixed

submitted by apsod
[link] [8 comments]
Categories: Incoming News

PHP as compiler target?

Haskell on Reddit - Wed, 05/13/2015 - 5:17am

Just an idea I recently had and would like to hear your opinion about:

One of the reasons PHP is so popular is, that it is available even with the most cheap webspace/domain providers. I don't know if this is possible at all, but would it make sense to develop something that compiles Haskell code to PHP? Sure, the performance would suffer, but in many beginner projects this is not an issue.

If such a feature would work nicely, I probably would use it. What do you think? :)

submitted by Dobias
[link] [33 comments]
Categories: Incoming News

Fedora ghc-7.10.1 repo

glasgow-user - Wed, 05/13/2015 - 4:28am
Hi, It is a bit later than I wanted but I have prepared a Fedora Copr repo for ghc-7.10.1. https://copr.fedoraproject.org/coprs/petersen/ghc-7.10.1/ (note the EPEL7 and F20 builds are currently quick builds and the F21+ builds are perf builds) I will probably add a cabal-install build soon. Let me know if you find any problems or if it is useful. :) Jens
Categories: Offsite Discussion

Review my introduction to Glance, a new visual representation of Haskell.

Haskell on Reddit - Wed, 05/13/2015 - 12:02am

Link to the Google Doc

Use the Google Doc comments for suggestions, questions, and confusions about specific parts of the document. Comment here on Reddit if you want to discuss Glance or visual languages in general.

Edit: Thank you everyone who has commented and made suggestions. This has already been very helpful.

submitted by RobbieGleichman
[link] [18 comments]
Categories: Incoming News

evaluate arbitrary code + mutable variables

haskell-cafe - Tue, 05/12/2015 - 9:19pm
I wrote a graph-like data type, with some functions for adding statements and relationships between statements. (It's different from a graph in that there's no edge/vertex distinction; there are only statements, but a statement might refer to other statements.) Currently I'm using them in GHCI. Because it does not allow one to change a variable, I keep having to do things like this: let g0 = emptyDocument let s1 = newStatement ... let g1 = addStatement s1 g0 let s2 = newStatement ... let g2 = addStatement s2 g1 ... If I wrote a standalone application, I could use mutable variables, so I would not have to define a new object every time I want to modify an existing one. However I like being able to type in arbitrary code into GHCI. Can one have both of those at once? That is, could I either (1) use MVars from within GHCI, or (2) write a standalone app that lets the user evaluate arbitrary Haskell? _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe< at >haskell.org h
Categories: Offsite Discussion

Parsing with Haskell

Haskell on Reddit - Tue, 05/12/2015 - 7:45pm
Categories: Incoming News

Brent Yorgey: Pan-Galactic Division in Haskell

Planet Haskell - Tue, 05/12/2015 - 7:07pm

Summary: given an injective function , it is possible to constructively “divide by ” to obtain an injection , as shown recently by Peter Doyle and Cecil Qiu and expounded by Richard Schwartz. Their algorithm is nontrivial to come up with—this had been a longstanding open question—but it’s not too difficult to explain. I exhibit some Haskell code implementing the algorithm, and show some examples.

Introduction: division by two

Suppose someone hands you the following:

  • A Haskell function f :: (A, Bool) -> (B, Bool), where A and B are abstract types (i.e. their constructors are not exported, and you have no other functions whose types mention A or B).

  • A promise that the function f is injective, that is, no two values of (A, Bool) map to the same (B, Bool) value. (Thus (B, Bool) must contain at least as many inhabitants as (A, Bool).)

  • A list as :: [A], with a promise that it contains every value of type A exactly once, at a finite position.

Can you explicitly produce an injective function f' :: A -> B? Moreover, your answer should not depend on the order of elements in as.

It really seems like this ought to be possible. After all, if (B, Bool) has at least as many inhabitants as (A, Bool), then surely B must have at least as many inhabitants as A. But it is not enough to reason merely that some injection must exist; we have to actually construct one. This, it turns out, is tricky. As a first attempt, we might try f' a = fst (f (a, True)). That is certainly a function of type A -> B, but there is no guarantee that it is injective. There could be a1, a2 :: A which both map to the same b, that is, one maps to (b, False) and the other to (b, True). The picture below illustrates such a situation: (a1, True) and (a2, True) both map to b2. So the function f may be injective overall, but we can’t say much about f restricted to a particular Bool value.

The requirement that the answer not depend on the order of as also makes things difficult. (Over in math-land, depending on a particular ordering of the elements in as would amount to the well-ordering principle, which is equivalent to the axiom of choice, which in turn implies the law of excluded middle—and as we all know, every time someone uses the law of excluded middle, a puppy dies. …I feel like I’m in one of those DirecTV commercials. “Don’t let a puppy die. Ignore the order of elements in as.”) Anyway, making use of the order of values in as, we could do something like the following:

  • For each a :: A:
    • Look at the B values generated by f (a,True) and f (a,False). (Note that there might only be one distinct such B value).
    • If neither B value has been used so far, pick the one that corresponds to (a,True), and add the other one to a queue of available B values.
    • If one is used and one unused, pick the unused one.
    • If both are used, pick the next available B value from the queue.

It is not too hard I couldn’t be bothered to show that this will always successfully result in a total function A -> B, which is injective by construction. (One has to show that there will always be an available B value in the queue when you need it.) The only problem is that the particular function we get depends on the order in which we iterate through the A values. The above example illustrates this as well: if the A values are listed in the order , then we first choose , and then . If they are listed in the other order, we end up with and . Whichever value comes first “steals” , and then the other one takes whatever is left. We’d like to avoid this sort of dependence on order. That is, we want a well-defined algorithm which will yield a total, injective function A -> B, which is canonical in the sense that the algorithm yields the same function given any permutation of as.

It is possible—you might enjoy puzzling over this a bit before reading on!

Division by N

The above example is a somewhat special case. More generally, let denote a canonical finite set of size , and let and be arbitrary sets. Then, given an injection , is it possible to effectively (that is, without excluded middle or the axiom of choice) compute an injection ?

Translating down to the world of numbers representing set cardinalities—natural numbers if and are finite, or cardinal numbers in general—this just says that if then . This statement about numbers is obviously true, so it would be nice if we could say something similar about sets, so that this fact about numbers and inequalities can be seen as just a “shadow” of a more general theorem about sets and injections.

As hinted in the introduction, the interesting part of this problem is really the word “effectively”. Using the Axiom of Choice/Law of Excluded Middle makes the problem a lot easier, but either fails to yield an actual function that we can compute with, instead merely guaranteeing the existence of such a function, or gives us a function that depends on a particular ordering of .

Apparently this has been a longstanding open question, recently answered in the affirmative by Peter Doyle and Cecil Qiu in their paper Division By Four. It’s a really great paper: they give some fascinating historical context for the problem, and explain their algorithm (which is conceptually not all that difficult) using an intuitive analogy to a card game with certain rules. (It is not a “game” in the usual sense of having winners and losers, but really just an algorithm implemented with “players” and “cards”. In fact, you could get some friends together and actually perform this algorithm in parallel (if you have sufficiently nerdy friends).) Richard Schwartz’s companion article is also great fun and easy to follow (you should read it first).

A Game of Thrones Cards

Here’s a quick introduction to the way Doyle, Qiu, and Schwartz use a card game to formulate their algorithm. (Porting this framework to use “thrones” and “claimants” instead of “spots” and “cards” is left as an exercise to the reader.)

The finite set is to be thought of as a set of suits. The set will correspond to a set of players, and to a set of ranks or values (for example, Ace, 2, 3, …) In that case corresponds to a deck of cards, each card having a rank and a suit; and we can think of in terms of each player having in front of them a number of “spots” or “slots”, each labelled by a suit. An injection is then a particular “deal” where one card has been dealt into each of the spots in front of the players. (There may be some cards left over in the deck, but the fact that the function is total means every spot has a card, and the fact that it is injective is encoded in the common-sense idea that a given card cannot be in two spots at once.) For example, the example function from before:

corresponds to the following deal:

Here each column corresponds to one player’s hand, and the rows correspond to suit spots (with the spade spots on top and the heart spots beneath). We have mapped to the ranks A, 2, 3, and mapped T and F to Spades and Hearts respectively. The spades are also highlighted in green, since later we will want to pay particular attention to what is happening with them. You might want to take a moment to convince yourself that the deal above really does correspond to the example function from before.

A Haskell implementation

Of course, doing everything effectively means we are really talking about computation. Doyle and Qiu do talk a bit about computation, but it’s still pretty abstract, in the sort of way that mathematicians talk about computation, so I thought it would be interesting to actually implement the algorithm in Haskell.

The algorithm “works” for infinite sets, but only (as far as I understand) if you consider some notion of transfinite recursion. It still counts as “effective” in math-land, but over here in programming-land I’d like to stick to (finitely) terminating computations, so we will stick to finite sets and .

First, some extensions and imports. Nothing too controversial.

> {-# LANGUAGE DataKinds #-} > {-# LANGUAGE GADTs #-} > {-# LANGUAGE GeneralizedNewtypeDeriving #-} > {-# LANGUAGE KindSignatures #-} > {-# LANGUAGE RankNTypes #-} > {-# LANGUAGE ScopedTypeVariables #-} > {-# LANGUAGE StandaloneDeriving #-} > {-# LANGUAGE TypeOperators #-} > > module PanGalacticDivision where > > import Control.Arrow (second, (&&&), (***)) > import Data.Char > import Data.List (find, findIndex, transpose) > import Data.Maybe > > import Diagrams.Prelude hiding (universe, value) > import Diagrams.Backend.Rasterific.CmdLine > import Graphics.SVGFonts

We’ll need some standard machinery for type-level natural numbers. Probably all this stuff is in a library somewhere but I couldn’t be bothered to find out. Pointers welcome.

> -- Standard unary natural number type > data Nat :: * where > Z :: Nat > Suc :: Nat -> Nat > > type One = Suc Z > type Two = Suc One > type Three = Suc Two > type Four = Suc Three > type Six = Suc (Suc Four) > type Eight = Suc (Suc Six) > type Ten = Suc (Suc Eight) > type Thirteen = Suc (Suc (Suc Ten)) > > -- Singleton Nat-indexed natural numbers, to connect value-level and > -- type-level Nats > data SNat :: Nat -> * where > SZ :: SNat Z > SS :: Natural n => SNat n -> SNat (Suc n) > > -- A class for converting type-level nats to value-level ones > class Natural n where > toSNat :: SNat n > > instance Natural Z where > toSNat = SZ > > instance Natural n => Natural (Suc n) where > toSNat = SS toSNat > > -- A function for turning explicit nat evidence into implicit > natty :: SNat n -> (Natural n => r) -> r > natty SZ r = r > natty (SS n) r = natty n r > > -- The usual canonical finite type. Fin n has exactly n > -- (non-bottom) values. > data Fin :: Nat -> * where > FZ :: Fin (Suc n) > FS :: Fin n -> Fin (Suc n) > > finToInt :: Fin n -> Int > finToInt FZ = 0 > finToInt (FS n) = 1 + finToInt n > > deriving instance Eq (Fin n) Finiteness

Next, a type class to represent finiteness. For our purposes, a type a is finite if we can explicitly list its elements. For convenience we throw in decidable equality as well, since we will usually need that in conjunction. Of course, we have to be careful: although we can get a list of elements for a finite type, we don’t want to depend on the ordering. We must ensure that the output of the algorithm is independent of the order of elements.1 This is in fact true, although somewhat nontrivial to prove formally; I mention some of the intuitive ideas behind the proof below.

While we are at it, we give Finite instances for Fin n and for products of finite types.

> class Eq a => Finite a where > universe :: [a] > > instance Natural n => Finite (Fin n) where > universe = fins toSNat > > fins :: SNat n -> [Fin n] > fins SZ = [] > fins (SS n) = FZ : map FS (fins n) > > -- The product of two finite types is finite. > instance (Finite a, Finite b) => Finite (a,b) where > universe = [(a,b) | a <- universe, b <- universe] Division, inductively

Now we come to the division algorithm proper. The idea is that panGalacticPred turns an injection into an injection , and then we use induction on to repeatedly apply panGalacticPred until we get an injection .

> panGalacticDivision > :: forall a b n. (Finite a, Eq b) > => SNat n -> ((a, Fin (Suc n)) -> (b, Fin (Suc n))) -> (a -> b)

In the base case, we are given an injection , so we just pass a unit value in along with the and project out the .

> panGalacticDivision SZ f = \a -> fst (f (a, FZ))

In the inductive case, we call panGalacticPred and recurse.

> panGalacticDivision (SS n') f = panGalacticDivision n' (panGalacticPred n' f) Pan-Galactic Predecessor

And now for the real meat of the algorithm, the panGalacticPred function. The idea is that we swap outputs around until the function has the property that every output of the form corresponds to an input also of the form . That is, using the card game analogy, every spade in play should be in the leftmost spot (the spades spot) of some player’s hand (some spades can also be in the deck). Then simply dropping the leftmost card in everyone’s hand (and all the spades in the deck) yields a game with no spades. That is, we will have an injection . Taking predecessors everywhere (i.e. “hearts are the new spades”) yields the desired injection .

We need a Finite constraint on a so that we can enumerate all possible inputs to the function, and an Eq constraint on b so that we can compare functions for extensional equality (we iterate until reaching a fixed point). Note that whether two functions are extensionally equal does not depend on the order in which we enumerate their inputs, so far validating my claim that nothing depends on the order of elements returned by universe.

> panGalacticPred > :: (Finite a, Eq b, Natural n) > => SNat n > -> ((a, Fin (Suc (Suc n))) -> (b, Fin (Suc (Suc n)))) > -> ((a, Fin (Suc n)) -> (b, Fin (Suc n)))

We construct a function f' which is related to f by a series of swaps, and has the property that it only outputs FZ when given FZ as an input. So given (a,i) we can call f' on (a, FS i) which is guaranteed to give us something of the form (b, FS j). Thus it is safe to strip off the FS and return (b, j) (though the Haskell type checker most certainly does not know this, so we just have to tell it to trust us).

> panGalacticPred n f = \(a,i) -> second unFS (f' (a, FS i)) > where > unFS :: Fin (Suc n) -> Fin n > unFS FZ = error "impossible!" > unFS (FS i) = i

To construct f' we iterate a certain transformation until reaching a fixed point. For finite sets and this is guaranteed to terminate, though it is certainly not obvious from the Haskell code. (Encoding this in Agda so that it is accepted by the termination checker would be a fun (?) exercise.)

One round of the algorithm consists of two phases called “shape up” and “ship out” (to be described shortly).

> oneRound = natty n $ shipOut . shapeUp > > -- iterate 'oneRound' beginning with the original function... > fs = iterate oneRound f > -- ... and stop when we reach a fixed point. > f' = fst . head . dropWhile (uncurry (=/=)) $ zip fs (tail fs) > f1 =/= f2 = all (\x -> f1 x == f2 x) universe Encoding Card Games

Recall that a “card” is a pair of a value and a suit; we think of as the set of values and as the set of suits.

> type Card v s = (v, s) > > value :: Card v s -> v > value = fst > > suit :: Card v s -> s > suit = snd

Again, there are a number of players (one for each element of ), each of which has a “hand” of cards. A hand has a number of “spots” for cards, each one labelled by a different suit (which may not have any relation to the actual suit of the card in that position).

> type PlayerSpot p s = (p, s) > type Hand v s = s -> Card v s

A “game” is an injective function from player spots to cards. Of course, the type system is not enforcing injectivity here.

> type Game p v s = PlayerSpot p s -> Card v s

Some utility functions. First, a function to project out the hand of a given player.

> hand :: p -> Game p v s -> Hand v s > hand p g = \s -> g (p, s)

A function to swap two cards, yielding a bijection on cards.

> swap :: (Eq s, Eq v) => Card v s -> Card v s -> (Card v s -> Card v s) > swap c1 c2 = f > where > f c > | c == c1 = c2 > | c == c2 = c1 > | otherwise = c

leftmost finds the leftmost card in a player’s hand which has a given suit.

> leftmost :: Finite s => s -> Hand v s -> Maybe s > leftmost targetSuit h = find (\s -> suit (h s) == targetSuit) universe Playing Rounds

playRound abstracts out a pattern that is used by both shapeUp and shipOut. The first argument is a function which, given a hand, produces a function on cards; that is, based on looking at a single hand, it decides how to swap some cards around.2 playRound then applies that function to every hand, and composes together all the resulting permutations.

Note that playRound has both Finite s and Finite p constraints, so we should think about whether the result depends on the order of elements returned by any call to universe—I claimed it does not. Finite s corresponds to suits/spots, which corresponds to in the original problem formulation. explicitly has a canonical ordering, so this is not a problem. The Finite p constraint, on the face of it, is more problematic. We will have to think carefully about each of the rounds implemented in terms of playRound and make sure they do not depend on the order of players. Put another way, it should be possible for all the players to take their turn simultaneously.

> playRound :: (Finite s, Finite p, Eq v) => (Hand v s -> Card v s -> Card v s) -> Game p v s -> Game p v s > playRound withHand g = foldr (.) id swaps . g > where > swaps = map (withHand . flip hand g) players > players = universe Shape Up and Ship Out

Finally, we can describe the “shape up” and “ship out” phases, beginning with “shape up”. A “bad” card is defined as one having the lowest suit; make sure every hand with any bad cards has one in the leftmost spot (by swapping the leftmost bad card with the card in the leftmost spot, if necessary).

> shapeUp :: (Finite s, Finite p, Eq v) => Game p v s -> Game p v s > shapeUp = playRound shapeUp1 > where > badSuit = head universe > shapeUp1 theHand = > case leftmost badSuit theHand of > Nothing -> id > Just badSpot -> swap (theHand badSuit) (theHand badSpot)

And now for the “ship out” phase. Send any “bad” cards not in the leftmost spot somewhere else, by swapping with a replacement, namely, the card whose suit is the same as the suit of the spot, and whose value is the same as the value of the bad card in the leftmost spot. The point is that bad cards in the leftmost spot are OK, since we will eventually just ignore the leftmost spot. So we have to keep shipping out bad cards not in the leftmost spot until they all end up in the leftmost spot. For some intuition as to why this is guaranteed to terminate, consult Schwartz; note that columns tend to acquire more and more cards that have the same rank as a spade in the top spot (which never moves).

> shipOut :: (Finite s, Finite p, Eq v) => Game p v s -> Game p v s > shipOut = playRound shipOutHand > where > badSuit = head universe > spots = universe > shipOutHand theHand = foldr (.) id swaps > where > swaps = map (shipOut1 . (theHand &&& id)) (drop 1 spots) > shipOut1 ((_,s), spot) > | s == badSuit = swap (theHand spot) (value (theHand badSuit), spot) > | otherwise = id

And that’s it! Note that both shapeUp and shipOut are implemented by composing a bunch of swaps; in fact, in both cases, all the swaps commute, so the order in which they are composed does not matter. (For proof, see Schwartz.) Thus, the result is independent of the order of the players (i.e. the set A).

Enough code, let’s see an example! This example is taken directly from Doyle and Qiu’s paper, and the diagrams are being generated literally (literately?) by running the code in this blog post. Here’s the starting configuration:

Again, the spades are all highlighted in green. Recall that our goal is to get them all to be in the first row, but we have to do it in a completely deterministic, canonical way. After shaping up, we have:

Notice how the 6, K, 5, A, and 8 of spades have all been swapped to the top of their column. However, there are still spades which are not at the top of their column (in particular the 10, 9, and J) so we are not done yet.

Now, we ship out. For example, the 10 of spades is in the diamonds position in the column with the Ace of spades, so we swap it with the Ace of diamonds. Similarly, we swap the 9 of spades with the Queen of diamonds, and the Jack of spades with the 4 of hearts.

Shaping up does nothing at this point so we ship out again, and then continue to alternate rounds.

In the final deal above, all the spades are at the top of a column, so there is an injection from the set of all non-spade spots to the deck of cards with all spades removed. This example was, I suspect, carefully constructed so that none of the spades get swapped out into the undealt portion of the deck, and so that we end up with only spades in the top row. In general, we might end up with some non-spades also in the top row, but that’s not a problem. The point is that ignoring the top row gets rid of all the spades.

Anyway, I hope to write more about some “practical” examples and about what this has to do with combinatorial species, but this post is long enough already. Doyle and Qiu also describe a “short division” algorithm (the above is “long division”) that I hope to explore as well.

The rest of the code

For completeness, here’s the code I used to represent the example game above, and to render all the card diagrams (using diagrams 1.3).

> type Suit = Fin > type Rank = Fin > type Player = Fin > > readRank :: SNat n -> Char -> Rank n > readRank n c = fins n !! (fromJust $ findIndex (==c) "A23456789TJQK") > > readSuit :: SNat n -> Char -> Suit n > readSuit (SS _) 'S' = FZ > readSuit (SS (SS _)) 'H' = FS FZ > readSuit (SS (SS (SS _))) 'D' = FS (FS FZ) > readSuit (SS (SS (SS (SS _)))) 'C' = FS (FS (FS FZ)) > > readGame :: SNat a -> SNat b -> SNat n -> String -> Game (Player a) (Rank b) (Suit n) > readGame a b n str = \(p, s) -> table !! finToInt p !! finToInt s > where > table = transpose . map (map readCard . words) . lines $ str > readCard [r,s] = (readRank b r, readSuit n s) > > -- Example game from Doyle & Qiu > exampleGameStr :: String > exampleGameStr = unlines > [ "4D 6H QD 8D 9H QS 4C AD 6C 4S" > , "JH AH 9C 8H AS TC TD 5H QC JS" > , "KC 6S 4H 6D TS 9S JC KD 8S 8C" > , "5C 5D KS 5S TH JD AC QH 9D KH" > ] > > exampleGame :: Game (Player Ten) (Rank Thirteen) (Suit Four) > exampleGame = readGame toSNat toSNat toSNat exampleGameStr > > suitSymbol :: Suit n -> String > suitSymbol = (:[]) . ("♠♥♦♣"!!) . finToInt -- Huzzah for Unicode > > suitDia :: Suit n -> Diagram B > suitDia = (suitDias!!) . finToInt > > suitDias = map mkSuitDia (fins (toSNat :: SNat Four)) > mkSuitDia s = text' (suitSymbol s) # fc (suitColor s) # lw none > > suitColor :: Suit n -> Colour Double > suitColor n > | finToInt n `elem` [0,3] = black > | otherwise = red > > rankStr :: Rank n -> String > rankStr n = rankStr' (finToInt n + 1) > where > rankStr' 1 = "A" > rankStr' i | i <= 10 = show i > | otherwise = ["JQK" !! (i - 11)] > > text' t = stroke (textSVG' (TextOpts lin INSIDE_H KERN False 1 1) t) > > renderCard :: (Rank b, Suit n) -> Diagram B > renderCard (r, s) = mconcat > [ mirror label > , cardContent (finToInt r + 1) > , back > ] > where > cardWidth = 2.25 > cardHeight = 3.5 > cardCorners = 0.1 > mirror d = d d # rotateBy (1/2) > back = roundedRect cardWidth cardHeight cardCorners # fc white > # lc (case s of { FZ -> green; _ -> black }) > label = vsep 0.1 [text' (rankStr r), text' (suitSymbol s)] > # scale 0.6 # fc (suitColor s) # lw none > # translate ((-0.9) ^& 1.5) > cardContent n > | n <= 10 = pips n > | otherwise = face n # fc (suitColor s) # lw none > # sized (mkWidth (cardWidth * 0.6)) > pip = suitDia s # scale 1.1 > pips 1 = pip # scale 2 > pips 2 = mirror (pip # up 2) > pips 3 = pips 2 pip > pips 4 = mirror (pair pip # up 2) > pips 5 = pips 4 pip > pips 6 = mirror (pair pip # up 2) pair pip > pips 7 = pips 6 pip # up 1 > pips 8 = pips 6 mirror (pip # up 1) > pips 9 = mirror (pair (pip # up (2/3) pip # up 2)) pip # up (case finToInt s of {1 -> -0.1; 3 -> 0; _ -> 0.1}) > pips 10 = mirror (pair (pip # up (2/3) pip # up 2) pip # up (4/3)) > pips _ = mempty > up n = translateY (0.5*n) > pair d = hsep 0.4 [d, d] # centerX > face 11 = squares # frame 0.1 > face 12 = loopyStar > face 13 = burst # centerXY > squares > = strokeP (mirror (square 1 # translate (0.2 ^& 0.2))) > # fillRule EvenOdd > loopyStar > = regPoly 7 1 > # star (StarSkip 3) > # pathVertices > # map (cubicSpline True) > # mconcat > # fillRule EvenOdd > burst > = [(1,5), (1,-5)] # map r2 # fromOffsets > # iterateN 13 (rotateBy (-1/13)) > # mconcat # glueLine > # strokeLoop > > renderGame :: (Natural n, Natural a) => Game (Player a) (Rank b) (Suit n) -> Diagram B > renderGame g = hsep 0.5 $ map (\p -> renderHand p $ hand p g) universe > > renderHand :: Natural n => Player a -> Hand (Rank b) (Suit n) -> Diagram B > renderHand p h = vsep 0.2 $ map (renderCard . h) universe
  1. If we could program in Homotopy Type Theory, we could make this very formal by using the notion of cardinal-finiteness developed in my dissertation (see section 2.4).

  2. In practice this function on cards will always be a permutation, though the Haskell type system is not enforcing that at all. An early version of this code used the Iso type from lens, but it wasn’t really paying its way.


Categories: Offsite Blogs

FP Complete: Distributing our packages without a sysadmin

Planet Haskell - Tue, 05/12/2015 - 6:00pm

At FP Complete, we're no strangers to running complex web services. But we know from experience that the simplest service to maintain is one someone else is managing for you. A few days ago I described how secure package distribution with stackage-update and stackage-install works, focusing on the client side tooling. Today's blog post is about how we use Amazon S3, Github, and Travis CI to host all of this with (almost) no servers of our own (that caveat explained in the process).

Making executables available

We have two different Haskell tools needed to this hosting: hackage-mirror to copy the raw packages to Hackage, and all-cabal-hashes-tool to populate the raw cabal files with hash/package size information. But we don't want to have to compile this executables every time we call them. Instead, we'd like to simply download and run a precompiled executable.

Like many other Github projects, these two utilize Travis CI to build and test the code every time a commit is pushed. But that's not all; using Travis's deployment capability, they also upload an executable to S3.

Figuring out the details of making this work is a bit tricky, so it's easiest to just look at the .travis.yml file. For the security conscious: the trick is that Travis allows us to encrypt data so that no one but Travis can decrypt it. Then, Travis can decrypt and upload it to S3 for us.

Result: a fully open, transparent process for executable building that can be reviewed by anyone in the community, without allowing private credentials to be leaked. Also, notice how none of our own servers needed to get involved.

Running the executables

We're going to leverage Travis yet again, and use it to run the executables it so politely generated for us. We'll use all-cabal-hashes as our demonstration, though all-cabal-packages works much the same way. We have an update.sh script which downloads and runs our executable, and then commits, signs, and pushes to Github. In order to sign and push, however, we need to have a GPG and SSH key, respectively.

Once again, Travis's encryption capabilities come into play. In the .travis.yml file, we decrypt a tar file containing the GPG and SSH key, put them in the correct location, and also configure Git. Then we call out to the update.sh script. One wrinkle here is that Travis only supports having a single encrypted file per repo, which is why we have to tar together the two different keys, which is a minor annoyance.

As before, we have processes running on completely open, auditable systems. Uploads are being made to providers we don't manage (either Amazon or Github). The only thing kept hidden are the secrets themselves (keys). And if the process ever fails, I get an immediate notification from Travis. So far, that's only happened when I was playing with the build or Hackage was unresponsive.

Running regularly

It wouldn't be very useful if these processes weren't run regularly. This is a perfect place for a cron job. Unfortunately, Travis doesn't yet support cron job, though they seem to be planning it for the future. In the meanwhile, we do have to run this on our own service. Fortunately, it's a simple job that just asks Travis to restart the last build it ran for each repository.

To simplify even further, I run the Travis command line client from inside a Docker container, so that the only host system dependency is Docker itself. The wrapper script is:

#!/bin/bash set -e set -x docker run --rm -v /home/ubuntu/all-cabal-files-internal.sh:/run.sh:ro bilge/travis-cli /run.sh

The script that runs inside the Docker container is the following (token hidden to protect... well, me).

#!/bin/bash set -ex travis login --skip-version-check --org --github-token XXXXXXXXX # Trigger the package mirroring first, since it's used by all-cabal-hashes BUILD=$(travis branches --skip-version-check -r commercialhaskell/all-cabal-packages | grep "^hackage" | awk "{ print \$2 }") BUILDNUM=${BUILD###} echo BUILD=$BUILD echo BUILDNUM=$BUILDNUM travis restart --skip-version-check -r commercialhaskell/all-cabal-packages $BUILDNUM BUILD=$(travis branches --skip-version-check -r commercialhaskell/all-cabal-files | grep "^hackage" | awk "{ print \$2 }") BUILDNUM=${BUILD###} echo BUILD=$BUILD echo BUILDNUM=$BUILDNUM travis restart --skip-version-check -r commercialhaskell/all-cabal-files $BUILDNUM # Put in a bit of a delay to allow the all-cabal-packages job to finish. If # not, no big deal, next job will pick up the change. sleep 30 BUILD=$(travis branches --skip-version-check -r commercialhaskell/all-cabal-hashes | grep "^hackage" | awk "{ print \$2 }") BUILDNUM=${BUILD###} echo BUILD=$BUILD echo BUILDNUM=$BUILDNUM travis restart --skip-version-check -r commercialhaskell/all-cabal-hashes $BUILDNUMConclusion

Letting someone else deal with our file storage, file serving, executable building, and update process is a massive time saver. Now our sysadmins can stop dealing with these problems, and start solving complicated problems. The fact that everyone can inspect, learn from, and understand what our services are doing is another advantage. I encourage others to try out these kinds of deployments whenever possible.

Categories: Offsite Blogs