News aggregator

FP Complete: Devops best practices: Multifaceted Testing

Planet Haskell - Mon, 11/28/2016 - 12:00pm

Even among skilled enterprise IT departments, it is too rare that software is thoroughly tested before deployment. Failed deployments mean costly downtime, service failures, upset users, and even security breaches. How can we verify that a solution is actually ready to deploy, free of serious defects?

Contact us for professional assistance

You probably need more kinds of tests

We’ve all seen “ready to deploy” applications that do not work as expected once deployed. Often it’s because the production system is not in fact identical to the staging system, so the testing wasn’t valid. This can be prevented with another devops best practice, automated deployments -- which we talked about in this recent post and will return to again.

But often the problem is that the software, even on a properly configured test system and staging system, was never fully tested. Before an app is approved for deployment, your QA system (mostly automated) should complete:

  • Success tests
  • Failure tests
  • Corner-case tests
  • Randomized tests (also called mutation tests, fuzz tests)
  • Load and performance tests
  • Usability tests
  • Security tests

(If you were not using automated, reproducible deployments, you would also have to do explicit pre-tests of the deployment script itself -- but you are doing fully automated deployments, right? If you aren’t, consider moving to the sorts of tools we use in FP Deploy -- like Docker and Kubernetes, Puppet and Ansible.)

In the rest of this article we’ll see how each kind of testing adds something different and important.

Success testing: run with realistic inputs

Most operations teams won’t accept a deployment from the engineering group (dev or test or QA) unless the system has at least passed success testing. When presented with correct inputs, does the system generate correct outputs and not crash?

It’s the most basic testing. Yet it’s often left incomplete. To avoid serious omissions, use at least this checklist:

Did you play back a realistic workload, based on a real sample of typical inputs? If not, your test’s idea of what to try may be very different from what your users will actually do in the first day in production. Huge source of real-world failures. Did you test through both the UI and the API? Automated testing only through the UI may mask problems that the UI prevents or corrects. Automated testing only through the API can’t find pure UI bugs. Were your tests updated when the spec was updated? For that matter, was the spec even updated when the software’s intended function was updated? If “no” on either of these, you are not actually testing the software’s correct function. Do you have coverage? Has someone gone back through the spec and identified all the promised functionality, and verified that some test actually tests each item -- and that it was run on the latest build, and passed? Did you test in a system that’s configured the same as your production system? Especially if deployments are not automated, it is crucial to ensure that nothing was done in the staging system to make it easier to pass than the production system. Common examples: omitting firewalls that would be present between layers in production; running test services under accounts with excessive permissions; having a manual security checklist for production deployments that is not used on staging deployments; sequencing inputs so that multiple simulated users cannot appear concurrently. Passing tests may not mean much if the environment is rigged to ensure success.

Believe it or not, that was the easy part. For enterprise production quality, you still want to test your system six more ways. Jumping right in to number two...

Failure testing: when you break the law, do you go to jail?

Your testing is all under an automated set of continuous integration (CI) scripts, right? (If not, time to look into that.) But do you have tests that force all of the specified error conditions to occur? Do you pass in every identified kind of prohibited/invalid input? Do you also create all the realistic external error conditions, like a network link failure, a timeout, a full disk, low memory (with a tool like Chaos Monkey)?

A test suite doesn’t check whether specified error conditions actually generate the right errors is not complete. We recommend that the QA team present a report showing a list of all existing tests, what conditions they purport to test, and when they were last run and passed.

Corner-case testing: try something crazy

Maybe you’ve heard the joke: a QA engineer walks into a bar, and orders a beer, and 2 beers, and 20 beers, and 0 beers, and a million beers, and -1 beers. And a duck.

Corner-case testing means success testing using unrealistic but legal inputs. Often, developers write code that works correctly in typical cases, but fails in the extremes.

Before deploying, consider: are any of your users going to try anything crazy? What would be the oddest things still permitted, and what happens if you try? Who tested that and verified that the output was correct? Correct code works on all permitted inputs, not just average ones. This is a fast way to find bugs before deployment -- push the system right to the edge of what it should be able to do.

Corner cases vary by application, but here are some typical examples to spark your thinking. Where strings are permitted, what happens if they are in a very differently structured language, like Chinese or Arabic? What happens if they are extremely long? Where numbers are permitted, what happens if they are very large, very small, zero, negative? And why is the permitted range as big as it is; should it be reduced? Is it legal to request output of a billion records, and what happens if I do? Where options are permitted, what if someone chooses all of them, or a bizarre mixture? Can I order a pizza with 30 toppings? Can I prescribe 50 medicines, at 50 bottles each, for a sample patient? What happens if nested or structured inputs are extremely complex? Can I send an email with 100 embeddings, each of which is an email with 100 embeddings?

If an application hasn’t been tested with ridiculous-but-legal inputs, no one really knows if it’s going to hold up in production.

Randomized testing: never saw that before!

No human team can test every possible combination of cases and actions. And that may be okay, because many projects find more bugs per unit of effort through randomized generation of test cases than any other way.

This means writing scripts that start with well-understood inputs, and then letting them make random, arbitrary changes to these inputs and run again, then change and run again, many thousands of times. Even if it’s not realistic to test the outputs for correctness (because the script may be unable to tell what its crazy inputs were supposed to do), the outputs can be tested for structural validity -- and the system can be watched for not crashing, and not generating any admin alerts or unhandled errors or side effects.

It’s downright surprising how fast you can find bugs in a typical unsafe language (like Python or C or Java) through simple mutation testing. Extremely safe languages like Haskell tend to find these bugs at compile time, but it may still be worth trying some randomized testing. Remember, machine time is cheap; holes in deployed code are very expensive.

Load and performance testing: now that's heavy

Companies with good devops say heavy load is one of the top remaining sources of failure. The app works for a while, but fails when peak user workload hits. Be on the lookout for conditions that could overload your servers, and make sure someone is forcing them to happen on the staging system -- before they happen in production.

Consider whether your test and staging systems are similar enough to your production system. If your production system accepts 5000 requests per second on 10 big machines, and your test system accepts 5 per second on one tiny VM, how will you know about database capacity issues or network problems?

A good practice is to throw enormous, concurrent, simulated load at your test system that (1) exceeds any observed real-world load and (2) includes a wide mix of realistic inputs, perhaps a stream of historic real captured inputs as well as random ones. This reduces the chance that you threw a softball at the system when real users are going to throw a hardball.

Performance testing can include sending faster and faster inputs until some hardware resource becomes saturated. (You may enjoy watching system monitor screens as this is happening!) Find the bottleneck -- what resource can you expect to fail first in production? How will you prevent it? Do your deployment scripts specify an abundance of this resource? Have you implemented cloud auto-scaling so that new parallel servers are fired up when the typically scarce resource (CPU, RAM, network link, …) gets too busy?

Usability testing: it doesn't work if people can't use it

Most people consider this to be outside the realm of devops. Who cares if users find your system confusing and hard to use? Well, lots of people, but why should a devops person care?

What will happen to your production environment if a new feature is deployed and suddenly user confusion goes through the roof? Will they think there’s a bug? Will support calls double in an hour, and stay doubled? Will you be forced to do a rollback?

User interface design probably isn’t your job. But if you are deploying user-facing software that has not been usability tested, you’re going to hear about it. Encourage your colleagues to do real testing of their UI on realistic, uninitiated users (not just team members who know what the feature is supposed to do), or at least skeptical test staff who know how to try naive things on purpose, before declaring a new feature ready to deploy.

Security testing: before it's too late

One of the worst things you can do is to cause a major security breach, leading to a loss of trust and exposure of users’ private data.

Testing a major, public-facing, multi-server device (a distributed app) for security is a big topic, and I’d be doing a disservice by trying to summarize it in just a couple of paragraphs. We’ll return in future posts to both the verification and testing side, and the design and implementation side, of security. A best practice is to push quality requirements upstream, letting developers know that security is their concern too, and ensuring that integration-test systems use a secure automated deployment similar or identical to the production system. Don’t let developers say “I assume you’ll secure this later.”

Meanwhile, as a devops best practice, your deployment and operations team should have at least one identified security expert, a person whose job includes knowing about all the latest security test tools and ensuring that they are being used where appropriate. Are you checking for XSS attacks and SQL injection attacks? DDoS attacks? Port-scan attacks? Misconfigured default accounts? It’s easy to neglect security until it’s too late, so make someone responsible.

Security holes can appear in application code, or in the platform itself (operating system, library packages, and middleware). At a minimum, security testing should include running standard off-the-shelf automated scanning software that looks for known ways to intrude, most often taking advantage of poor default configurations, or of platform components that have not been upgraded to the latest patch level. Run an automated security scan before moving a substantially changed system from staging into production. Automated testing is almost free, in stark contrast to the costly manual clean-up after a breach.


Wow, that’s a lot of testing! More than a lot of companies actually do. Yet it’s quite hard to look back through that list and find something that’s okay to omit. Clearly the devops pipeline doesn’t begin at the moment of deployment, but sooner, in the development team itself. That means developers and testers taking responsibility for delivering a quality product that’s actually ready to deploy.

As usual, systems thinking wins. A good operations team doesn’t say “give us what you’ve got, and we’ll somehow get it online.” A good operations team says “we offer deployment and operation services, and here’s how you can give us software that will deploy successfully into our production environment.”

We hope you’ll keep reading our blog and making use of what we have learned. Thanks for spending time with FP Complete.

Contact us to help your engineering and devops teams

Categories: Offsite Blogs

Ken T Takusagawa: [xxneozpu] Demonstration of Data.Numbers.Fixed

Planet Haskell - Mon, 11/28/2016 - 12:38am

Here is a little demonstration of arbitrary precision floating-point (actually fixed-point) arithmetic in Haskell, using Data.Numbers.Fixed in the numbers package.  Using dynamicEps, we calculate sqrt(1/x) to arbitrary precision, outputting the number in a user-specified base.

Performance is not so great; the package implements Floating functions using elegant but not necessarily high performance algorithms based on continued fractions (Gosper, HAKMEM).  The hmpfr package might be better.  (For square root, it might also be easy to roll one's own implementation of Newton's method on Rational.)

Radix conversion of a fractional number is implemented as an unfold, similar to radix conversion of an integer.

Categories: Offsite Blogs

Michael Snoyman: Haskell Documentation, 2016 Update

Planet Haskell - Sun, 11/27/2016 - 6:00pm

I've blogged, Tweeted, and conversed about Haskell documentation quite a bit in the past. Following up on tooling issues, all available evidence tells me that improving the situation for documentation in Haskell is the next obstacle we need to knock down.

This blog post will cover:

  • Where I think the biggest value is to be had in improving Haskell documentation
  • Status of various initiatives I've been involved in (and boy do I walk away red-faced from this)
  • Recommendations for how others can best contribute
  • A basic idea of what actions I - and others at FP Complete - have been taking
Intermediate docs

In my opinion, the sore spot for Haskell overall is intermediate docs. (Yes, that's vague, bear with me momentarily.) I'm going to posit that:

  • Beginners are currently well served by introductory Haskell books, and most recently by Haskell Programming from First Principles
  • Once you have a solid basis in intermediate concepts, it's much easier to jump into libraries and academic papers and understand what's going on

To me, intermediate means you already know the basics of Haskell syntax, monads, and common typeclasses, but aren't really familiar with any non-base libraries, concurrency, or exception handling. The goal of intermediate documentation is to:

  • Teach which libraries to use, when to use them, and how to use them
  • Give a guide on structuring Haskell programs
  • Educate on important techniques, including those mentioned above, as well as issues around lazy evaluation and other common stumbling blocks

Many of us who have learned Haskell over the past many years have probably picked up these topics sporadically. While some people will want to plow ahead in that kind of haphazard approach, my belief is that the vast majority of users want to be more guided through the process. We'll get to the Haskell Syllabus and opinionated vs unopinionated content a bit later.

Previous efforts

It turns out, as I'm quite embarassed to admit, that I've essentially tried reinventing the same intermediate docs concept multiple times, first with MezzoHaskell, and then with the Commercial Haskell doc initiative. You may also include School of Haskell in that list too, but I'm going to treat it separately.

These initiatives never took off. A pessimistic view is that Haskellers are simply uninterested in contributing to such a shared body of intermediate-level docs. I actually believed that for a bit, but recent activity has convinced me otherwise. I think these previous initiatives failed due to an unsatisfactory user experience. These initiatives required people to go to an infrequently used Github repo to view docs, which no one was doing. A few months back, a new option presented itself.

haskell-lang's documentation

For those who haven't seen it, you should check out the libraries page and documentation page on the site. I believe this hits the nail on the head in many different ways:

  • It directly addresses a common user question: which are the best libraries to use for a certain task? The libraries page gives this answer (though it can certainly be improved and expanded!)
  • We're able to curate the documentation as a community. Providing a list of recommended documents on this site gives a reader more confidence than simply Googling and hoping the author knows what he/she is talking about
  • The collaboration is done via pull requests on markdown files. I've discussed previously why I think this is a far better collaboration technique than Wikis or other options.
  • Instead of requiring all docs live within the repository, documents can be embedded from elsewhere. For example, I've written a conduit tutorial in the conduit repository, and embedded its content on via a simple inclusion mechanism. This allows authors to maintain their documentation individually, but provide users with a central location to find these kinds of documents. (I'd encourage other sites to take advantage of this transclusion technique, getting quality content into user hands is the goal!)

haskell-lang tries to host only "uncontroversial" documentation. Documents explaining how to use a library are pretty straightforward. Recommending libraries like bytestring, text, and vector are all pretty well accepted. And for cases where multiple libraries are used, we link to both.

I've merged all of the content I wrote in MezzoHaskell and the Commercial Haskell doc initiative into where it fit. However, there was still some more controversial content left, such as exceptions best practices, which I know many people disagree with me about. Also, I'd like to be able to tell a user looking for a solution, "yes, there are multiple libraries around, I recommend X." Neither of these belong on a community site like haskell-lang, so for those...

More opinionated content

This is where alternative sites thrive. Since I'm collaborating with others at FP Complete on this, and actively using this in training courses, I've put together a Haskell Syllabus page page. This is where I'll tell someone "you should do X, even though others disagree with me." I won't enumerate the contentious decisions here (odds are someone else will ultimately make such a list on my behalf).

And if you disagree with this? Write a new syllabus! I think it would be a healthy thing if we could get to the point as a community where we could have multiple recommended, opinionated syllabuses, and link to them with a short description for each one. This may sound at odds with some of my previous statements, so let me clarify:

  • When there's an obviously best choice, tell the user to use it
  • When most users will be best with one choice, and another option is available, mention it as a footnote
  • When multiple options are available and there's no way to know which the user will want, break down and give them all the information they need. But...
  • Try to make that happen as infrequently - and as late in the learning process - as possible! If we could have a "you've completed Beginner Haskell, please choose between one of the following," and explain the difference between "FP Complete's course" vs (for example) "lens-first Haskell", that would be a good trade-off.

My thoughts on this are still evolving, and will likely change in the future as I get more feedback from users.

Writing style

Another big change I've made over the years is writing style. I wrote the Yesod book in a very prose-heavy manner, focusing on explaining details with words, and using concise, to-the-point code examples. Many users have given me feedback to push me in a different direction. Instead, I've recently been writing trying to write in a very different, and thankfully easier to write, style:

  • Short explanation of what the thing is I'm talking about and when you'd use it
  • Synopsis: medium sized code snippet to give a flavor (I used this in the Yesd book too, and stole the idea straight from Perl docs)
  • A series of increasingly complex examples, with the bare minimum amount of content around it to explain what's going on

I'd put this style into a hybrid of tutorial and cookbook, and think it works well overall. I've only heard positives so far versus previous styles, so that's encouraging. Some examples:

I'm taking this approach because I think it's what most users want. Some important points:

  • Not all users are the same! There will almost certainly be users who would prefer a different style of docs. Given enough user feedback and manpower to write docs, it would be great to cater to all tastes, but it's best right now to focus on the highest demand
  • API docs are still necessary, and are completely orthogonal to tutorials. A tutorial doesn't document each API call, an API-call-level explanation doesn't give enough breadth, and certainly users need more than just the type signatures.
What you can do

After all of that, my recommendation on how to get involved is pretty simple:

  • Pick a library that doesn't have a good tutorial
  • Write a tutorial
  • Submit a PR to haskell-lang to include the content
  • Alternatively: get the Markdown file included in the project's repo instead, and include the remote file instead
Linking to libraries

haskell-lang has a nifty feature. If you visit, it will display the vector documentation it has. But if you visit a package like which doesn't (yet) have a tutorial on haskell-lang, it will automatically redirect you to the Stackage package page. When giving out links to people on the internet, I recommend using the link.

  • When a tutorial is available, the haskell-lang page is great
  • When a tutorial isn't available, the doc building on Stackage is still the most reliable around
  • In addition, Stackage docs properly link to docs built in the same Stackage snapshot, making the cross linking more reliable
  • When present, Stackage gives preference to the files in a package, which are generally more useful than the description fields.
School of Haskell

I'd be remiss in not mentioning School of Haskell here. As far as I'm concerned, School of Haskell is great platform for an individual to write content without any collaboration. However, for all of the cases I'm describing here, some kind of easy collaboration (via pull requests) is a huge win. Opening things up more with pull requests, files, and embedding content into multiple external sites seems like the best option today (until someone comes up with something better!).

Categories: Offsite Blogs

Manuel M T Chakravarty: Here is the video of my talk “A Type is Worth a Thousand...

Planet Haskell - Fri, 11/25/2016 - 9:38pm
<iframe allowfullscreen="allowfullscreen" frameborder="0" height="225" mozallowfullscreen="mozallowfullscreen" src=";byline=0&amp;portrait=0" title="A Type is Worth a Thousand Tests" webkitallowfullscreen="webkitallowfullscreen" width="400"></iframe>

Here is the video of my talk “A Type is Worth a Thousand Tests” presented at Sydney CocoaHeads, November 2016 (you can also get the slides) — I previously presented this talk at YOW! Connected, Melbourne, and an earlier version at Curry On in Rome.

In this talk, I argue that types are a design tool that, if applied correctly, reduces the need for tests. I illustrate this at the example of the design of a simple iPhone app in Swift whose source code is available on GitHub.

Categories: Offsite Blogs

Joe Armstrong Interviews Alan Kay

Lambda the Ultimate - Fri, 11/25/2016 - 9:25am

Youtube video (via HN)

By far not the best presentation of Kay's ideas but surely a must watch for fans. Otherwise, skip until the last third of the interview which might add to what most people here already know.

It is interesting that in this talk Kay rather explicitly talks about programming languages as abstraction layers. He also mentions some specifics that may not be as well known as others, yet played a role in his trajectory, such as META.

I fully sympathize with his irritation with the lack of attention to and appreciation of fundamental concepts and theoretical frameworks in CS. On the other hand, I find his allusions to biology unconvincing.
An oh, he is certainly right about Minsky's book (my first introduction to theoretical CS) and in his deep appreciation of John McCarthy.

Categories: Offsite Discussion

Derek Elkins: Category Theory, Syntactically

Planet Haskell - Fri, 11/25/2016 - 12:23am
(or: How Model Theory Got Scooped by Category Theory) Overview

This will be a very non-traditional introduction to the ideas behind category theory. It will essentially be a slice through model theory (presented in a more programmer-friendly manner) with an unusual organization. Of course the end result will be ***SPOILER ALERT*** it was category theory all along. A secret decoder ring will be provided at the end. This approach is inspired by the notion of an internal logic/language and by Vaughn Pratt’s paper The Yoneda Lemma Without Category Theory.

I want to be very clear, though. This is not meant to be an analogy or an example or a guide for intuition. This is category theory. It is simply presented in a different manner.


The first concept we’ll need is that of a theory. If you’ve ever implemented an interpreter for even the simplest language, than most of what follows modulo some terminological differences should be both familiar and very basic. If you are familiar with algebraic semantics, then that is exactly what is happening here only restricting to unary (but multi-sorted) algebraic theories.

For us, a theory, #ccT#, is a collection of sorts, a collection of (unary) function symbols1, and a collection of equations. Each function symbol has an input sort and an output sort which we’ll call the source and target of the function symbol. We’ll write #ttf : A -> B# to say that #ttf# is a function symbol with source #A# and target #B#. We define #"src"(ttf) -= A# and #"tgt"(ttf) -= B#. Sorts and function symbols are just symbols. Something is a sort if it is in the collection of sorts. Nothing else is required. A function symbol is not a function, it’s just a, possibly structured, name. Later, we’ll map those names to functions, but the same name may be mapped to different functions. In programming terms, a theory defines an interface or signature. We’ll write #bb "sort"(ccT)# for the collection of sorts of #ccT# and #bb "fun"(ccT)# for the collection of function symbols.

A (raw) term in a theory is either a variable labelled by a sort, #bbx_A#, or it’s a function symbol applied to a term, #tt "f"(t)#, such that the sort of the term #t# matches the source of #ttf#. The sort or target of a term is the sort of the variable if it’s a variable or the target of the outermost function symbol. The source of a term is the sort of the innermost variable. In fact, all terms are just sequences of function symbol applications to a variable, so there will always be exactly one variable. All this is to say the expressions need to be “well-typed” in the obvious way. Given a theory with two function symbols #ttf : A -> B# and #ttg : B -> A#, #bbx_A#, #bbx_B# , #tt "f"(bbx_A)#, and #tt "f"(tt "g"(tt "f"(bbx_A)))# are all examples of terms. #tt "f"(bbx_B)# and #tt "f"(tt "f"(bbx_A))# are not terms because they are not “well-typed”, and #ttf# by itself is not a term simply because it doesn’t match the syntax. Using Haskell syntax, we can define a data type representing this syntax if we ignore the sorting:

data Term = Var Sort | Apply FunctionSymbol Term

Using GADTs, we could capture the sorting constraints as well:

data Term (s :: Sort) (t :: Sort) where Var :: Term t t Apply :: FunctionSymbol x t -> Term s x -> Term s t

An important operation on terms is substitution. Given a term #t_1# with source #A# and a term #t_2# with target #A# we define the substitution of #t_2# into #t_1#, written #t_1[bbx_A |-> t_2]#, as:

If #t_1 = bbx_A# then #bbx_A[bbx_A |-> t_2] -= t_2#.

If #t_1 = tt "f"(t)# then #tt "f"(t)[bbx_A |-> t_2] -= tt "f"(t[bbx_A |-> t_2])#.

Using the theory from before, we have:

#tt "f"(bbx_A)[bbx_A |-> tt "g"(bbx_B)] = tt "f"(tt "g"(bbx_B))#

As a shorthand, for arbitrary terms #t_1# and #t_2#, #t_1(t_2)# will mean #t_1[bbx_("src"(t_1)) |-> t_2]#.

Finally, equations2. An equation is a pair of terms with equal source and target, for example, #(: tt "f"(tt "g"(bbx_B)), bbx_B :)#. The idea is that we want to identify these two terms. To do this we quotient the set of terms by the congruence generated by these pairs, i.e. by the reflexive-, symmetric-, transitive-closure of the relation generated by the equations which further satisfies “if #s_1 ~~ t_1# and #s_2 ~~ t_2# then #s_1(s_2) ~~ t_1(t_2)#”. From now on, by “terms” I’ll mean this quotient with “raw terms” referring to the unquotiented version. This means that when we say “#tt "f"(tt "g"(bbx_B)) = bbx_B#”, we really mean the two terms are congruent with respect to the congruence generated by the equations. We’ll write #ccT(A, B)# for the collection of terms, in this sense, with source #A# and target #B#. To make things look a little bit more normal, I’ll write #s ~~ t# as a synonym for #(: s, t :)# when the intent is that the pair represents a given equation.

Expanding the theory from before, we get the theory of isomorphisms, #ccT_{:~=:}#, consisting of two sorts, #A# and #B#, two function symbols, #ttf# and #ttg#, and two equations #tt "f"(tt "g"(bbx_B)) ~~ bbx_B# and #tt "g"(tt "f"(bbx_A)) ~~ bbx_A#. The equations lead to equalities like #tt "f"(tt "g"(tt "f"(bbx_A))) = tt "f"(bbx_A)#. In fact, it doesn’t take much work to show that this theory only has four distinct terms: #bbx_A#, #bbx_B#, #tt "f"(bbx_A)#, and #tt "g"(bbx_B)#.

In traditional model theory or universal algebra we tend to focus on multi-ary operations, i.e. function symbols that can take multiple inputs. By restricting ourselves to only unary function symbols, we expose a duality. For every theory #ccT#, we have the opposite theory, #ccT^(op)# defined by using the same sorts and function symbols but swapping the source and target of the function symbols which also requires rewriting the terms in the equations. The rewriting on terms is the obvious thing, e.g. if #ttf : A -> B#, #ttg : B -> C#, and #tth : C -> D#, then the term in #ccT#, #tt "h"(tt "g"(tt "f"(bbx_A)))# would become the term #tt "f"(tt "g"(tt "h"(bbx_D)))# in #ccT^(op)#. From this it should be clear that #(ccT^(op))^(op) = ccT#.

Product Theories

Given two theories #ccT_1# and #ccT_2# we can form a new theory #ccT_1 xx ccT_2# called the product theory of #ccT_1# and #ccT_2#. The sorts of this theory are pairs of sorts from #ccT_1# and #ccT_2#. The collection of function symbols is the disjoint union #bb "fun"(ccT_1) xx bb "sort"(ccT_2) + bb "sort"(ccT_1) xx bb "fun"(ccT_2)#. A disjoint union is like Haskell’s Either type. Here we’ll write #tt "inl"# and #tt "inr"# for the left and right injections respectively. #tt "inl"# takes a function symbol from #ccT_1# and a sort from #ccT_2# and produces a function symbol of #ccT_1 xx ccT_2# and similarly for #tt "inr"#. If #tt "f" : A -> B# in #ccT_1# and #C# is a sort of #ccT_2#, then #tt "inl"(f, C) : (A, C) -> (B, C)# and similarly for #tt "inr"#.

The collection of equations for #ccT_1 xx ccT_2# consists of the following:

  • for every equation, #l ~~ r# of #ccT_1# and every sort, #C#, of #ccT_2# we produce an equation #l’ ~~ r’# by replacing each function symbol #ttf# in #l# and #r# with #tt "inl"(tt "f", C)#
  • similarly for equations of #ccT_2#
  • for every pair of function symbols #ttf : A -> B# from #ccT_1# and #ttg : C -> D# from #ccT_2#, we produce the equation #tt "inl"(tt "f", D)(tt "inr"(A, tt "g")(bbx_{:(A, C")":})) ~~ tt "inr"(B, tt "g")("inl"(tt "f", C)(bbx_{:(A, C")":}))#

The above is probably unreadable. If you work through it, you can show that every term of #ccT_1 xx ccT_2# is equivalent to a pair of terms #(t_1, t_2)# where #t_1# is a term in #ccT_1# and #t_2# is a term in #ccT_2#. Using this equivalence, the first bullet is seen to be saying that if #l = r# in #ccT_1# and #C# is a sort in #ccT_2# then #(l, bbx_C) = (r, bbx_C)# in #ccT_1 xx ccT_2#. The second is similar. The third then states

#(t_1, bbx_C)((bbx_A, t_2)(bbx_{:"(A, C)":})) = (t_1, t_2)(bbx_{:"(A, C)":}) = (bbx_A, t_2)((t_1, bbx_C)(bbx_{:"(A, C)":}))#.

To establish the equivalence between terms of #ccT_1 xx ccT_2# and pairs of terms from #ccT_1# and #ccT_2#, we use the third bullet to move all the #tt "inl"#s outward at which point we’ll have a sequence of #ccT_1# function symbols followed by a sequence of #ccT_2# function symbols each corresponding to term.

The above might seem a bit round about. An alternative approach would be to define the function symbols of #ccT_1 xx ccT_2# to be all pairs of all the terms from #ccT_1# and #ccT_2#. The problem with this approach is that it leads to an explosion in the number of function symbols and equations required. In particular, it easily produces an infinitude of function symbols and equations even when provided with theories that only have a finite number of sorts, function symbols, and equations.

As a concrete and useful example, consider the theory #ccT_bbbN# consisting of a single sort, #0#, a single function symbol, #tts#, and no equations. This theory has a term for each natural number, #n#, corresponding to #n# applications of #tts#. Now let’s articulate #ccT_bbbN xx ccT_bbbN#. It has one sort, #(0, 0)#, two function symbols, #tt "inl"(tt "s", 0)# and #tt "inr"(0, tt "s")#, and it has one equation, #tt "inl"(tt "s", 0)(tt "inr"(0, tt "s")(bbx_{:(0, 0")":})) ~~ tt "inr"(0, tt "s")("inl"(tt "s", 0)(bbx_{:(0, 0")":}))#. Unsurprisingly, the terms of this theory correspond to pairs of natural numbers. If we had used the alternative definition, we’d have had an infinite number of function symbols and an infinite number of equations.

Nevertheless, for clarity I will typically write a term of a product theory as a pair of terms.

As a relatively easy exercise — easier than the above — you can formulate and define the disjoint sum of two theories #ccT_1 + ccT_2#. The idea is that every term of #ccT_1 + ccT_2# corresponds to either a term of #ccT_1# or a term of #ccT_2#. Don’t forget to define what happens to the equations.

Related to these, we have the theory #ccT_{:bb1:}#, which consists of one sort and no function symbols or equations, and #ccT_{:bb0:}# which consists of no sorts and thus no possibility for function symbols or equations. #ccT_{:bb1:}# has exactly one term while #ccT_{:bb0:}# has no terms.


Sometimes we’d like to talk about function symbols whose source is in one theory and target is in another. As a simple example, that we’ll explore in more depth later, we may want function symbols whose sources are in a product theory. This would let us consider terms with multiple inputs.

The natural way to achieve this is to simply make a new theory that contains sorts from both theories plus the new function symbols. A collage, #ccK#, from a theory #ccT_1# to #ccT_2#, written #ccK : ccT_1 ↛ ccT_2#, is a theory whose collection of sorts is the disjoint union of the sorts of #ccT_1# and #ccT_2#. The function symbols of #ccK# consist for each function symbol #ttf : A -> B# in #ccT_1#, a function symbol #tt "inl"(ttf) : tt "inl"(A) -> tt "inl"(B)#, and similarly for function symbols from #ccT_2#. Equations from #ccT_1# and #ccT_2# are likewise taken and lifted appropriately, i.e. #ttf# is replaced with #tt "inl"(ttf)# or #tt "inr"(ttf)# as appropriate. Additional function symbols of the form #k : tt "inl"(A) -> tt "inr"(Z)# where #A# is a sort of #ccT_1# and #Z# is a sort of #ccT_2#, and potentially additional equations involving these function symbols, may be given. (If no additional function symobls are given, then this is exactly the disjoint sum of #ccT_1# and #ccT_2#.) These additional function symbols and equations are what differentiate two collages that have the same source and target theories. Note, there are no function symbols #tt "inr"(Z) -> tt "inl"(A)#, i.e. where #Z# is in #ccT_2# and #A# is in #ccT_1#. That is, there are no function symbols going the “other way”. To avoid clutter, I’ll typically assume that the sorts and function symbols of #ccT_1# and #ccT_2# are disjoint already, and dispense with the #tt "inl"#s and #tt "inr"#s.

Summarizing, we have #ccK(tt "inl"(A), "inl"(B)) ~= ccT_1(A, B)#, #ccK(tt "inr"(Y), tt "inr"(Z)) ~= ccT_2(Y, Z)#, and #ccK(tt "inr"(Z), tt "inl"(A)) = O/# for all #A#, #B#, #Y#, and #Z#. #ccK(tt "inl"(A), tt "inr"(Z))# for any #A# and #Z# is arbitrary generated. To distinguish them, I’ll call the function symbols that go from one theory to another bridges. More generally, an arbitrary term that has it’s source in one theory and target in another will be described as a bridging term.

Here’s a somewhat silly example. Consider #ccK_+ : ccT_bbbN xx ccT_bbbN ↛ ccT_bbbN# that has one bridge #tt "add" : (0, 0) -> 0# with the equations #tt "add"(tt "inl"(tts, 0)(bbx_("("0, 0")"))) ~~ tts(tt "add"(bbx_("("0, 0")")))# and #tt "add"(tt "inr"(0, tts)(bbx_("("0, 0")"))) ~~ tts(tt "add"(bbx_("("0, 0")")))#.

More usefully, if a bit degenerately, every theory induces a collage in the following way. Given a theory #ccT#, we can build the collage #ccK_ccT : ccT ↛ ccT# where the bridges consist of the following. For each sort, #A#, of #ccT#, we have the following bridge: #tt "id"_A : tt "inl"(A) -> tt "inr"(A)#. Then, for every function symbol, #ttf : A -> B# in #ccT#, we have the following equation: #tt "inl"(tt "f")(tt "id"_A(bbx_(tt "inl"(A)))) ~~ tt "id"_B(tt "inr"(tt "f")(bbx_(tt "inl"(A))))#. We have #ccK_ccT(tt "inl"(A), tt "inr"(B)) ~= ccT(A, B)#.

You can think of a bridging term in a collage as a sequence of function symbols partitioned into two parts by a bridge. Naturally, we might consider partitioning into more than two parts by having more than one bridge. It’s easy to generalize the definition of collage to combine an arbitrary number of theories, but I’ll take a different, probably less good, route. Given collages #ccK_1 : ccT_1 ↛ ccT_2# and #ccK_2 : ccT_2 ↛ ccT_3#, we can make the collage #ccK_2 @ ccK_1 : ccT_1 ↛ ccT_3# by defining its bridges to be triples of a bridge of #ccK_1#, #k_1 : A_1 -> A_2#, a term, #t : A_2 -> B_2# of #ccT_2#, and a bridge of #ccK_2#, #k_2 : B_2 -> B_3# which altogether will be a bridge of #ccK_2 @ ccK_1# going from #A_1 -> B_3#. These triples essentially represent a term like #k_2(t(k_1(bbx_(A_1))))#. With this intuition we can formulate the equations. For each equation #t'(k_1(t_1)) ~~ s'(k'_1(s_1))# where #k_1# and #k'_1# are bridges of #ccK_1#, we have for every bridge #k_2# of #ccK_2# and term #t# of the appropriate sorts #(k_2, t(t'(bbx)), k_1)(t_1) ~~ (k_2, t(s'(bbx)), k'_1)(s_1)# and similarly for equations involving the bridges of #ccK_2#.

This composition is associative… almost. Furthermore, the collages generated by theories, #ccK_ccT#, behave like identities to this composition… almost. It turns out these statements are true, but only up to isomorphism of theories. That is, #(ccK_3 @ ccK_2) @ ccK_1 ~= ccK_3 @ (ccK_2 @ ccK_1)# but is not equal.

To talk about isomorphism of theories we need the notion of…


An interpretation of a theory gives meaning to the syntax of a theory. There are two nearly identical notions of interpretation for us: interpretation (into sets) and interpretation into a theory. I’ll define them in parallel. An interpretation (into a theory), #ccI#, is a mapping, written #⟦-⟧^ccI# though the superscript will often be omitted, which maps sorts to sets (sorts) and function symbols to functions (terms). The mapping satisfies:

#⟦"src"(f)⟧ = "src"(⟦f⟧)# and #⟦"tgt"(f)⟧ = "tgt"(⟦f⟧)# where #"src"# and #"tgt"# on the right are the domain and codomain operations for an interpretation.

We extend the mapping to a mapping on terms via:

  • #⟦bbx_A⟧ = x |-> x#, i.e. the identity function, or, for interpretation into a theory, #⟦bbx_A⟧ = bbx_{:⟦A⟧:}#
  • #⟦tt "f"(t)⟧ = ⟦tt "f"⟧ @ ⟦t⟧# or, for interpretation into a theory, #⟦tt "f"(t)⟧ = ⟦tt "f"⟧(⟦t⟧)#

and we require that for any equation of the theory, #l ~~ r#, #⟦l⟧ = ⟦r⟧#. (Technically, this is implicitly required for the extension of the mapping to terms to be well-defined, but it’s clearer to state it explicitly.) I’ll write #ccI : ccT -> bb "Set"# when #ccI# is an interpretation of #ccT# into sets, and #ccI’ : ccT_1 -> ccT_2# when #ccI’# is an interpretation of #ccT_1# into #ccT_2#.

An interpretation of the theory of isomorphisms produces a bijection between two specified sets. Spelling out a simple example where #bbbB# is the set of booleans:

  • #⟦A⟧ -= bbbB#
  • #⟦B⟧ -= bbbB#
  • #⟦tt "f"⟧ -= x |-> not x#
  • #⟦tt "g"⟧ -= x |-> not x#

plus the proof #not not x = x#.

As another simple example, we can interpret the theory of isomorphisms into itself slightly non-trivially.

  • #⟦A⟧ -= B#
  • #⟦B⟧ -= A#
  • #⟦tt "f"⟧ -= tt "g"(bbx_B)#
  • #⟦tt "g"⟧ -= tt "f"(bbx_A)#

As an (easy) exercise, you should define #pi_1 : ccT_1 xx ccT_2 -> ccT_1# and similarly #pi_2#. If you defined #ccT_1 + ccT_2# before, you should define #iota_1 : ccT_1 -> ccT_1 + ccT_2# and similarly for #iota_2#. As another easy exercise, show that an interpretation of #ccT_{:~=:}# is a bijection. In Haskell, an interpretation of #ccT_bbbN# would effectively be foldNat. Something very interesting happens when you consider what an interpretation of the collage generated by a theory, #ccK_ccT#, is. Spell it out. In a different vein, you can show that a collage #ccK : ccT_1 ↛ ccT_2# and an interpretation #ccT_1^(op) xx ccT_2 -> bb "Set"# are essentially the same thing in the sense that each gives rise to the other.

Two theories are isomorphic if there exists interpretations #ccI_1 : ccT_1 -> ccT_2# and #ccI_2 : ccT_2 -> ccT_1# such that #⟦⟦A⟧^(ccI_1)⟧^(ccI_2) = A# and visa versa, and similarly for function symbols. In other words, each is interpretable in the other, and if you go from one interpretation and then back, you end up where you started. Yet another way to say this is that there is a one-to-one correspondence between sorts and terms of each theory, and this correspondence respects substitution.

As a crucially important example, the set of terms, #ccT(A, B)#, can be extended to an interpretation. In particular, for each sort #A#, #ccT(A, -) : ccT -> bb "Set"#. It’s action on function symbols is the following:

#⟦tt "f"⟧^(ccT(A, -)) -= t |-> tt "f"(t)#

We have, dually, #ccT(-, A) : ccT^(op) -> bb "Set"# with the following action:

#⟦tt "f"⟧^(ccT(-, A)) -= t |-> t(tt "f"(bbx_B))#

We can abstract from both parameters making #ccT(-, =) : ccT^(op) xx ccT -> bb "Set"# which, by an early exercise, can be shown to correspond with the collage #ccK_ccT#.

Via an abuse of notation, I’ll identify #ccT^(op)(A, -)# with #ccT(-, A)#, though technically we only have an isomorphism between the interpretations, and to talk about isomorphisms between interpretations we need the notion of…


The theories we’ve presented are (multi-sorted) universal algebra theories. Universal algebra allows us to specify a general notion of “homomorphism” that generalizes monoid homomorphism or group homomorphism or ring homomorphism or lattice homomorphism.

In universal algebra, the algebraic theory of groups consists of a single sort, a nullary operation, #1#, a binary operation, #*#, a unary operation, #tt "inv"#, and some equations which are unimportant for us. Operations correspond to our function symbols except that they’re are not restricted to being unary. A particular group is a particular interpretation of the algebraic theory of groups, i.e. it is a set and three functions into the set. A group homomorphism then is a function between those two groups, i.e. between the two interpretations, that preserves the operations. In a traditional presentation this would look like the following:

Say #alpha : G -> K# is a group homomorphism from the group #G# to the group #K# and #g, h in G# then:

  • #alpha(1_G) = 1_K#
  • #alpha(g *_G h) = alpha(g) *_K alpha(h)#
  • #alpha(tt "inv"_G(g)) = tt "inv"_K(alpha(g))#

Using something more akin to our notation, it would look like:

  • #alpha(⟦1⟧^G) = ⟦1⟧^K#
  • #alpha(⟦*⟧^G(g,h)) = ⟦*⟧^K(alpha(g), alpha(h))#
  • #alpha(⟦tt "inv"⟧^G(g)) = ⟦tt "inv"⟧^K(alpha(g))#

The #tt "inv"# case is the most relevant for us as it is unary. However, for us, a function symbol #ttf# may have a different source and target and so we made need a different function on each side of the equation. E.g. for #ttf : A -> B#, #alpha : ccI_1 -> ccI_2#, and #a in ⟦A⟧^(ccI_1)# we’d have:

#alpha_B(⟦tt "f"⟧^(ccI_1)(a)) = ⟦tt "f"⟧^(ccI_2)(alpha_A(a))#

So a homomorphism #alpha : ccI_1 -> ccI_2 : ccT -> bb "Set"# is a family of functions, one for each sort of #ccT#, that satisfies the above equation for every function symbol of #ccT#. We call the individual functions making up #alpha# components of #alpha#, and we have #alpha_A : ⟦A⟧^(ccI_1) -> ⟦A⟧^(ccI_2)#. The definition for an interpretation into a theory, #ccT_2#, is identical except the components of #alpha# are terms of #ccT_2# and #a# can be replaced with #bbx_(⟦A⟧^(ccI_1))#. Two interpretations are isomorphic if we have homomorphism #alpha : ccI_1 -> ccI_2# such that each component is a bijection. This is the same as requiring a homomorphism #beta : ccI_2 -> ccI_1# such that for each #A#, #alpha_A(beta_A(x)) = x# and #beta_A(alpha_A(x)) = x#. A similar statement can be made for interpretations into theories, just replace #x# with #bbx_(⟦A⟧)#.

Another way to look at homomorphisms is via collages. A homomorphism #alpha : ccI_1 -> ccI_2 : ccT -> bb "Set"# gives rise to an interpretation of the collage #ccK_ccT#. The interpretation #ccI_alpha : ccK_ccT -> bb "Set"# is defined by:

  • #⟦tt "inl"(A)⟧^(ccI_alpha) -= ⟦A⟧^(ccI_1)#
  • #⟦tt "inr"(A)⟧^(ccI_alpha) -= ⟦A⟧^(ccI_2)#
  • #⟦tt "inl"(ttf)⟧^(ccI_alpha) -= ⟦ttf⟧^(ccI_1)#
  • #⟦tt "inr"(ttf)⟧^(ccI_alpha) -= ⟦ttf⟧^(ccI_2)#
  • #⟦tt "id"_A⟧^(ccI_alpha) -= alpha_A#

The homomorphism law guarantees that it satisfies the equation on #tt "id"#. Conversely, given an interpretation of #ccK_ccT#, we have the homomorphism, #⟦tt "id"⟧ : ⟦tt "inl"(-)⟧ -> ⟦tt "inr"(-)⟧ : ccT -> bb "Set"#. and the equation on #tt "id"# is exactly the homomorphism law.


Consider a homomorphism #alpha : ccT(A, -) -> ccI#. The #alpha# needs to satisfy for every sort #B# and #C#, every function symbol #ttf : C -> D#, and every term #t : B -> C#:

#alpha_D(tt "f"(t)) = ⟦tt "f"⟧^ccI(alpha_C(t))#

Looking at this equation, the possibility of viewing it as a recursive “definition” leaps out suggesting that the action of #alpha# is completely determined by it’s action on the variables. Something like this, for example:

#alpha_D(tt "f"(tt "g"(tt "h"(bbx_A)))) = ⟦tt "f"⟧(alpha_C(tt "g"(tt "h"(bbx_A)))) = ⟦tt "f"⟧(⟦tt "g"⟧(alpha_B(tt "h"(bbx_A)))) = ⟦tt "f"⟧(⟦tt "g"⟧(⟦tt "h"⟧(alpha_A(bbx_A))))#

We can easily establish that there’s a one-to-one correspondence between the set of homomorphisms #ccT(A, -) -> ccI# and the elements of the set #⟦A⟧^ccI#. Given a homomorphism, #alpha#, we get an element of #⟦A⟧^ccI# via #alpha_A(bbx_A)#. Inversely, given an element #a in ⟦A⟧^ccI#, we can define a homomorphism #a^**# via:

  • #a_D^**(tt "f"(t)) -= ⟦tt "f"⟧^ccI(a_C^**(t))#
  • #a_A^**(bbx_A) -= a#

which clearly satisfies the condition on homomorphisms by definition. It’s easy to verify that #(alpha_A(bbx_A))^** = alpha# and immediately true that #a^**(bbx_A) = a# establishing the bijection.

We can state something stronger. Given any homomorphism #alpha : ccT(A, -) -> ccI# and any function symbol #ttg : A -> X#, we can make a new homomorphism #alpha * ttg : ccT(X, -) -> ccI# via the following definition:

#(alpha * ttg)(t) = alpha(t(tt "g"(bbx_A)))#

Verifying that this is a homomorphism is straightforward:

#(alpha * ttg)(tt "f"(t)) = alpha(tt "f"(t(tt "g"(bbx_A)))) = ⟦tt "f"⟧(alpha(t(tt "g"(bbx_A)))) = ⟦tt "f"⟧((alpha * ttg)(t))#

and like any homomorphism of this form, as we’ve just established, it is completely determined by it’s action on variables, namely #(alpha * ttg)_A(bbx_A) = alpha_X(tt "g"(bbx_A)) = ⟦tt "g"⟧(alpha_A(bbx_A))#. In particular, if #alpha = a^**#, then we have #a^** * ttg = (⟦tt "g"⟧(a))^**#. Together these facts establish that we have an interpretation #ccY : ccT -> bb "Set"# such that #⟦A⟧^ccY -= (ccT(A, -) -> ccI)#, the set of homomorphisms, and #⟦tt "g"⟧^ccY(alpha) -= alpha * tt "g"#. The work we did before established that we have homomorphisms #(-)(bbx) : ccY -> ccI# and #(-)^** : ccI -> ccY# that are inverses. This is true for all theories and all interpretations as at no point did we use any particular facts about them. This statement is the (dual form of the) Yoneda lemma. To get the usual form simply replace #ccT# with #ccT^(op)#. A particularly important and useful case (so useful it’s usually used tacitly) occurs when we choose #ccI = ccT(B,-)#, we get #(ccT(A, -) -> ccT(B, -)) ~= ccT(B, A)# or, choosing #ccT^(op)# everywhere, #(ccT(-, A) -> ccT(-, B)) ~= ccT(A, B)# which states that a term from #A# to #B# is equivalent to a homomorphism from #ccT(-, A)# to #ccT(-, B)#.

There is another result, dual in a different way, called the co-Yoneda lemma. It turns out it is a corollary of the fact that for a collage #ccK : ccT_1 ↛ ccT_2#, #ccK_(ccT_2) @ ccK ~= ccK# and the dual is just the composition the other way. To get (closer to) the precise result, we need to be able to turn an interpretation into a collage. Given an interpretation, #ccI : ccT -> bb "Set"#, we can define a collage #ccK_ccI : ccT_bb1 ↛ ccT# whose bridges from #1 -> A# are the elements of #⟦A⟧^ccI#. Given this, the co-Yoneda lemma is the special case, #ccK_ccT @ ccK_ccI ~= ccK_ccI#.

Note, that the Yoneda and co-Yoneda lemmas only apply to interpretations into sets as #ccY# involves the set of homomorphisms.


The Yoneda lemma suggests that the interpretations #ccT(A, -)# and #ccT(-, A)# are particularly important and this will be borne out as we continue.

We call an interpretation, #ccI : ccT^(op) -> bb "Set"# representable if #ccI ~= ccT(-, X)# for some sort #X#. We then say that #X# represents #ccI#. What this states is that every term of sort #X# corresponds to an element in one of the sets that make up #ccI#, and these transform appropriately. There’s clearly a particularly important element, namely the image of #bbx_X# which corresponds to an element in #⟦X⟧^ccI#. This element is called the universal element. The dual concept is, for #ccI : ccT -> bb "Set"#, #ccI# is co-representable if #ccI ~= ccT(X, -)#. We will also say #X# represents #ccI# in this case as it actually does when we view #ccI# as an interpretation of #(ccT^(op))^(op)#.

As a rather liberating exercise, you should establish the following result called parameterized representability. Assume we have theories #ccT_1# and #ccT_2#, and a family of sorts of #ccT_2#, #X#, and a family of interpretations of #ccT_2^(op)#, #ccI#, both indexed by sorts of #ccT_1#, such that for each #A in bb "sort"(ccT_1)#, #X_A# represents #ccI_A#, i.e. #ccI_A ~= ccT_2(-, X_A)#. Given all this, then there is a unique interpretation #ccX : ccT_1 -> ccT_2# and #ccI : ccT_1 xx ccT_2^(op) -> bb "Set"# where #⟦A⟧^(ccX) -= X_A# and #"⟦("A, B")⟧"^ccI -= ⟦B⟧^(ccI_A)# such that #ccI ~= ccT_2(=,⟦-⟧^ccX)#. To be a bit more clear, the right hand side means #(A, B) |-> ccT_2(B, ⟦A⟧^ccX)#. Simply by choosing #ccT_1# to be a product of multiple theories, we can generalize this result to an arbitrary number of parameters. What makes this result liberating is that we just don’t need to worry about the parameters, they will automatically transform homomorphically. As a technical warning though, since two interpretations may have the same action on sorts but a different action on function symbols, if the family #X_A# was derived from an interpretation #ccJ#, i.e. #X_A -= ⟦A⟧^ccJ#, it may not be the case that #ccX = ccJ#.

Let’s look at some examples.

As a not-so-special case of representability, we can consider #ccI -= ccK(tt "inl"(-), tt "inr"(Z))# where #ccK : ccT_1 ↛ ccT_2#. Saying that #A# represents #ccI# in this case is saying that bridging terms of sort #tt "inr"(Z)#, i.e. sort #Z# in #ccT_2#, in #ccK#, correspond to terms of sort #A# in #ccT_1#. We’ll call the universal element of this representation the universal bridge (though technically it may be a bridging term, not a bridge). Let’s write #varepsilon# for this universal bridge. What representability states in this case is given any bridging term #k# of sort #Z#, there exists a unique term #|~ k ~|# of sort #A# such that #k = varepsilon(|~ k ~|)#. If we have an interpretation #ccX : ccT_2 -> ccT_1# such that #⟦Z⟧^ccX# represents #ccK(tt "inl"(-), tt "inr"(Z))# for each sort #Z# of #ccT_2# we say we have a right representation of #ccK#. Note, that the universal bridges become a family #varepsilon_Z : ⟦Z⟧^ccX -> Z#. Similarly, if #ccK(tt "inl"(A), tt "inr"(-))# is co-representable for each #A#, we say we have a left representation of #ccK#. The co-universal bridge is then a bridging term #eta_A : A -> ⟦A⟧# such that for any bridging term #k# with source #A#, there exists a unique term #|__ k __|# in #ccT_2# such that #k = |__ k __|(eta_A)#. For reference, we’ll call these equations universal properties of the left/right representation. Parameterized representability implies that a left/right representation is essentially unique.

Define #ccI_bb1# via #⟦A⟧^(ccI_bb1) -= bb1# where #bb1# is some one element set. #⟦ttf⟧^(ccI_bb1)# is the identity function for all function symbols #ttf#. We’ll say a theory #ccT# has a unit sort or has a terminal sort if there is a sort that we’ll also call #bb1# that represents #ccI_bb1#. Spelling out what that means, we first note that there is nothing notable about the universal element as it’s the only element. However, writing the homomorphism #! : ccI_bb1 -> ccT(-, bb1)# and noting that since there’s only one element of #⟦A⟧^(ccI_bb1)# we can, with a slight abuse of notation, also write the term #!# picks out as #!# which gives the equation:

#!_B(tt "g"(t)) = !_A(t)# for any function symbol #ttg : A -> B# and term, #t#, of sort #A#, note #!_A : A -> bb1#.

This equation states what the isomorphism also fairly directly states: there is exactly one term of sort #bb1# from any sort #A#, namely #!_A(bbx_A)#. The dual notion is called a void sort or an initial sort and will usually be notated #bb0#, the analog of #!# will be written as #0#. The resulting equation is:

#tt "f"(0_A) = 0_B# for any function symbol #ttf : A -> B#, note #0_A : bb0 -> A#.

For the next example, I’ll leverage collages. Consider the collage #ccK_2 : ccT ↛ ccT xx ccT# whose bridges from #A -> (B, C)# consist of pairs of terms #t_1 : A -> B# and #t_2 : A -> C#. #ccT# has pairs if #ccK_2# has a right representation. We’ll write #(B, C) |-> B xx C# for the representing interpretation’s action on sorts. We’ll write the universal bridge as #(tt "fst"(bbx_(B xx C)), tt "snd"(bbx_(B xx C)))#. The universal property then looks like #(tt "fst"(bbx_(B xx C)), tt "snd"(bbx_(B xx C)))((: t_1, t_2 :)) = (t_1, t_2)# where #(: t_1, t_2 :) : A -> B xx C# is the unique term induced by the bridge #(t_1, t_2)#. The universal property implies the following equations:

  • #(: tt "fst"(bbx_(B xx C)), tt "snd"(bbx_(B xx C))) = bbx_(B xx C)#
  • #tt "fst"((: t_1, t_2 :)) = t_1#
  • #tt "snd"((: t_1, t_2 :)) = t_2#

One aspect of note, is regardless of whether #ccK_2# has a right representation, i.e. regardless of whether #ccT# has pairs, it always has a left representation. The co-universal bridge is #(bbx_A, bbx_A)# and the unique term #|__(t_1, t_2)__|# is #tt "inl"(t_1, bbx_A)(tt "inr"(bbx_A, t_2)(bbx_("("A,A")")))#.

Define an interpretation #Delta : ccT -> ccT xx ccT# so that #⟦A⟧^Delta -= (A,A)# and similarly for function symbols. #Delta# left represents #ccK_2#. If the interpretation #(B,C) |-> B xx C# right represents #ccK_2#, then we say we have an adjunction between #Delta# and #(- xx =)#, written #Delta ⊣ (- xx =)#, and that #Delta# is left adjoint to #(- xx =)#, and conversely #(- xx =)# is right adjoint #Delta#.

More generally, whenever we have the situation #ccT_1(⟦-⟧^(ccI_1), =) ~= ccT_2(-, ⟦=⟧^(ccI_2))# we say that #ccI_1 : ccT_2 -> ccT_1# is left adjoint to #ccI_2 : ccT_1 -> ccT_2# or conversely that #ccI_2# is right adjoint to #ccI_1#. We call this arrangement an adjunction and write #ccI_1 ⊣ ccI_2#. Note that we will always have this situation if #ccI_1# left represents and #ccI_2# right represents the same collage. As we noted above, parameterized representability actually determines one adjoint given (its action on sorts and) the other adjoint. With this we can show that adjoints are unique up to isomorphism, that is, given two left adjoints to an interpretation, they must be isomorphic. Similarly for right adjoints. This means that stating something is a left or right adjoint to some other known interpretation essentially completely characterizes it. One issue with adjunctions is that they tend to be wholesale. Let’s say the pair sort #A xx B# existed but no other pair sorts existed, then the (no longer parameterized) representability approach would work just fine, but the adjunction would no longer exist.

Here’s a few of exercises using this. First, a moderately challenging one (until you catch the pattern): spell out the details to the left adjoint to #Delta#. We say a theory has sums and write those sums as #A + B# if #(- + =) ⊣ Delta#. Recast void and unit sorts using adjunctions and/or left/right representations. As a terminological note, we say a theory has finite products if it has unit sorts and pairs. Similarly, a theory has finite sums or has finite coproducts if it has void sorts and sums. An even more challenging exercise is the following: a theory has exponentials if it has finite products and for every sort #A#, #(A xx -) ⊣ (A => -)# (note, parameterized representability applies to #A#). Spell out the equations characterizing #A => B#.

Finite Product Theories

Finite products start to lift us off the ground. So far the theories we’ve been working with have been extremely basic: a language with only unary functions, all terms being just a sequence of applications of function symbols. It shouldn’t be underestimated though. It’s more than enough to do monoid and group theory. A good amount of graph theory can be done with just this. And obviously we were able to establish several general results assuming only this structure. Nevertheless, while we can talk about specific groups, say, we can’t talk about the theory of groups. Finite products change this.

A theory with finite products allows us to talk about multi-ary function symbols and terms by considering unary function symbols from products. This allows us to do all of universal algebra. For example, the theory of groups, #ccT_(bb "Grp")#, consists of a sort #S# and all it’s products which we’ll abbreviate as #S^n# with #S^0 -= bb1# and #S^(n+1) -= S xx S^n#. It has three function symbols #tte : bb1 -> S#, #ttm : S^2 -> S#, and #tti : S -> S# plus the ones that having finite products requires. In fact, instead of just heaping an infinite number of sorts and function symbols into our theory — and we haven’t even gotten to equations — let’s define a compact set of data from which we can generate all this data.

A signature, #Sigma#, consists of a collection of sorts, #sigma#, a collection of multi-ary function symbols, and a collection of equations. Equations still remain pairs of terms, but we need to now extend our definition of terms for this context. A term (in a signature) is either a variable, #bbx_i^[A_0,A_1,...,A_n]# with #A_i# are sorts and #0 <= i <= n#, the operators #tt "fst"# or #tt "snd"# applied to a term, the unit term written #(::)^A# with sort #A#, a pair of terms written #(: t_1, t_2 :)#, or the (arity correct) application of a multi-ary function symbol to a series of terms, e.g. #tt "f"(t_1, t_2, t_3)#. As a Haskell data declaration, it might look like:

data SigTerm = SigVar [Sort] Int | Fst SigTerm | Snd SigTerm | Unit Sort | Pair SigTerm SigTerm | SigApply FunctionSymbol [SigTerm]

At this point, sorting (i.e. typing) the terms is no longer trivial, though it is still pretty straightforward. Sorts are either #bb1#, or #A xx B# for sorts #A# and #B#, or a sort #A in sigma#. The source of function symbols or terms are lists of sorts.

  • #bbx_i^[A_0, A_1, ..., A_n] : [A_0, A_1, ..., A_n] -> A_i#
  • #(::)^A : [A] -> bb1#
  • #(: t_1, t_2 :) : bar S -> T_1 xx T_2# where #t_i : bar S -> T_i#
  • #tt "fst"(t) : bar S -> T_1# where #t : bar S -> T_1 xx T_2#
  • #tt "snd"(t) : bar S -> T_2# where #t : bar S -> T_1 xx T_2#
  • #tt "f"(t_1, ..., t_n) : bar S -> T# where #t_i : bar S -> T_i# and #ttf : [T_1,...,T_n] -> T#

The point of a signature was to represent a theory so we can compile a term of a signature into a term of a theory with finite products. The theory generated from a signature #Sigma# has the same sorts as #Sigma#. The equations will be equations of #Sigma#, with the terms compiled as will be described momentarily, plus for every pair of sorts the equations that describe pairs and the equations for #!#. Finally, we need to describe how to take a term of the signature and make a function symbol of the theory, but before we do that we need to explain how to convert those sources of the terms which are lists. That’s just a conversion to right nested pairs, #[A_0,...,A_n] |-> A_0 xx (... xx (A_n xx bb1) ... )#. The compilation of a term #t#, which we’ll write as #ccC[t]#, is defined as follows:

  • #ccC[bbx_i^[A_0, A_1, ..., A_n]] = tt "snd"^i(tt "fst"(bbx_(A_i xx(...))))# where #tt "snd"^i# means the #i#-fold application of #tt "snd"#
  • #ccC[(::)^A] = !_A#
  • #ccC[(: t_1, t_2 :)] = (: ccC[t_1], ccC[t_2] :)#
  • #ccC[tt "fst"(t)] = tt "fst"(ccC[t])#
  • #ccC[tt "snd"(t)] = tt "snd"(ccC[t])#
  • #ccC[tt "f"(t_1, ..., t_n)] = tt "f"((: ccC[t_1], (: ... , (: ccC[t_n], ! :) ... :) :))#

As you may have noticed, the generated theory will have an infinite number of sorts, an infinite number of function symbols, and an infinite number of equations no matter what the signature is — even an empty one! Having an infinite number of things isn’t a problem as long as we can algorithmically describe them and this is what the signature provides. Of course, if you’re a (typical) mathematician you nominally don’t care about an algorithmic description. Besides being compact, signatures present a nicer term language. The theories are like a core or assembly language. We could define a slightly nicer variation where we keep a context and manage named variables leading to terms-in-context like:

#x:A, y:B |-- tt "f"(x, x, y)#

which is

#tt "f"(bbx_0^[A,B], bbx_0^[A,B], bbx_1^[A,B])#

for our current term language for signatures. Of course, compilation will be (slightly) trickier for the nicer language.

The benefit of having compiled the signature to a theory, in addition to being able to reuse the results we’ve established for theories, is we only need to define operations on the theory, which is simpler since we only need to deal with pairs and unary function symbols. One example of this is we’d like to extend our notion of interpretation to one that respects the structure of the signature, and we can do that by defining an interpretation of theories that respects finite products.

A finite product preserving interpretation (into a finite product theory), #ccI#, is an interpretation (into a finite product theory) that additionally satisfies:

  • #⟦bb1⟧^ccI = bb1#
  • #⟦A xx B⟧^ccI = ⟦A⟧^ccI xx ⟦B⟧^ccI#
  • #⟦!_A⟧^ccI = !_(⟦A⟧^ccI)#
  • #⟦tt "fst"(t)⟧^ccI = tt "fst"(⟦t⟧^ccI)#
  • #⟦tt "snd"(t)⟧^ccI = tt "snd"(⟦t⟧^ccI)#
  • #⟦(: t_1, t_2 :)⟧^ccI = (: ⟦t_1⟧^ccI, ⟦t_2⟧^ccI :)#

where, for #bb "Set"#, #bb1 -= {{}}#, #xx# is the cartesian product, #tt "fst"# and #tt "snd"# are the projections, #!_A -= x |-> \{\}#, and #(: f, g :) -= x |-> (: f(x), g(x) :)#.

With signatures, we can return to our theory, now signature, of groups. #Sigma_bb "Grp"# has a single sort #S#, three function symbols #tte : [bb1] -> S#, #tti : [S] -> S#, and #ttm : [S, S] -> S#, with the following equations (written as equations rather than pairs):

  • #tt "m"(tt "e"((::)^S), bbx_0^S) = bbx_0^S#
  • #tt "m"(tt "i"(bbx_0^S), bbx_0^S) = tt "e"((::)^S)#
  • #tt "m"(tt "m"(bbx_0^[S,S,S], bbx_1^[S,S,S]), bbx_2^[S,S,S]) = tt "m"(bbx_0^[S,S,S], tt "m"(bbx_1^[S,S,S], bbx_2^[S,S,S]))#

or using the nicer syntax:

  • #x:S |-- tt "m"(tt "e"(), x) = x#
  • #x:S |-- tt "m"(tt "i"(x), x) = tt "e"()#
  • #x:S, y:S, z:S |-- tt "m"(tt "m"(x, y), z) = tt "m"(x, tt "m"(y, z))#

An actual group is then just a finite product preserving interpretation of (the theory generated by) this signature. All of universal algebra and much of abstract algebra can be formulated this way.

The Simply Typed Lambda Calculus and Beyond

We can consider additionally assuming that our theory has exponentials. I left articulating exactly what that means as an exercise, but the upshot is we have the following two operations:

For any term #t : A xx B -> C#, we have the term #tt "curry"(t) : A -> C^B#. We also have the homomorphism #tt "app"_(AB) : B^A xx A -> B#. They satisfy:

  • #tt "curry"(tt "app"(bbx_(B^A xx A))) = bbx_(B^A)#
  • #tt "app"((: tt "curry"(t_1), t_2 :)) = t_1((: bbx_A, t_2 :))# where #t_1 : A xx B -> C# and #t_2 : A -> B#.

We can view these, together with the the product operations, as combinators and it turns out we can compile the simply typed lambda calculus into the above theory. This is exactly what the Categorical Abstract Machine did. The “Caml” in “O’Caml” stands for “Categorical Abstract Machine Language”, though O’Caml no longer uses the CAM. Conversely, every term of the theory can be expressed as a simply typed lambda term. This means we can view the simply typed lambda calculus as just a different presentation of the theory.

At this point, this presentation of category theory starts to connect to the mainstream categorical literature on universal algebra, internal languages, sketches, and internal logic. This page gives a synopsis of the relationship between type theory and category theory. For some reason, it is unusual to talk about the internal language of a plain category, but that is exactly what we’ve done here.

I haven’t talked about finite limits or colimits beyond products and coproducts, nor have I talked about even the infinitary versions of products and coproducts, let alone arbitrary limits and colimits. These can be handled the same way as products and coproducts. Formulating a language like signatures or the simply typed lambda calculus is a bit more complicated, but not that hard. I may make a follow-up article covering this among other things. I also have a side project (don’t hold your breath), that implements the internal language of a category with finite limits. The result looks roughly like a simple version of an algebraic specification language like the OBJ family. The RING theory described in the Maude manual gives an idea of what it would look like. In fact, here’s an example of the current actual syntax I’m using.3

theory Categories type O type A given src : A -> O given tgt : A -> O given id : O -> A satisfying o:O | src (id o) = o, tgt (id o) = o given c : { f:A, g:A | src f = tgt g } -> A satisfying (f, g):{ f:A, g:A | src f = tgt g } | tgt (c (f, g)) = tgt f, src (c (f, g)) = src g satisfying "left unit" (o, f):{ o:O, f:A | tgt f = o } | c (id o, f) = f satisfying "right unit" (o, f):{ o:O, f:A | src f = o } | c (f, id o) = f satisfying "associativity" (f, g, h):{ f:A, g:A, h:A | src f = tgt g, src g = tgt h } | c (c (f, g), h) = c (f, c (g, h)) endtheory

It turns out this is a particularly interesting spot in the design space. The fact that the theory of theories with finite limits is itself a theory with finite limits has interesting consequences. It is still relatively weak though. For example, it’s not possible to describe the theory of fields in this language.

There are other directions one could go. For example, the internal logic of monoidal categories is (a fragment of) ordered linear logic. You can cross this bridge either way. You can look at different languages and consider what categorical structure is needed to support the features of the language, or you can add features to the category and see how that impacts the internal language. The relationship is similar to the source language and a core/intermediate language in a compiler, e.g. GHC Haskell and System Fc.


If you’ve looked at category theory at all, you can probably make most of the connections without me telling you. The table below outlines the mapping, but there are some subtleties. First, as a somewhat technical detail, my definition of a theory corresponds to a small category, i.e. a category which has a set of objects and a set of arrows. For more programmer types, you should think of “set” as Set in Agda, i.e. similar to the * kind in Haskell. Usually “category” means “locally small category” which may have a proper class of objects and between any two objects a set of arrows (though the union of all those sets may be a proper class). Again, for programmers, the distinction between “class” and “set” is basically the difference between Set and Set1 in Agda.4 To make my definition of theory closer to this, all that is necessary is instead of having a set of function symbols, have a family of sets indexed by pairs of objects. Here’s what a partial definition in Agda of the two scenarios would look like:

-- Small category (the definition I used) record SmallCategory : Set1 where field objects : Set arrows : Set src : arrows -> objects tgt : arrows -> objects ... -- Locally small category record LocallySmallCategory : Set2 where field objects : Set1 hom : objects -> objects -> Set ... -- Different presentation of a small category record SmallCategory' : Set1 where field objects : Set hom : objects -> objects -> Set ...

The benefit of the notion of locally small category is that Set itself is a locally small category. The distinction I was making between interpretations into theories and interpretations into Set was due to the fact that Set wasn’t a theory. If I used a definition theory corresponding to a locally small category, I could have combined the notions of interpretation by making Set a theory. The notion of a small category, though, is still useful. Also, an interpretation into Set corresponds to the usual notion of a model or semantics, while interpretations into other theories was a less emphasized concept in traditional model theory and universal algebra.

A less technical and more significant difference is that my definition of a theory doesn’t correspond to a category, but rather to a presentation of a category, from which a category can be generated. The analog of arrows in a category is terms, not function symbols. This is a bit more natural route from the model theory/universal algebra/programming side. Similarly, having an explicit collection of equations, rather than just an equivalence relation on terms is part of the presentation of the category but not part of the category itself.

model theory category theory sort object term arrow function symbol generating arrow theory presentation of a (small) category collage collage, cograph of a profunctor bridge heteromorphism signature presentation of a (small) category with finite products interpretation into sets, aka models a functor into Set, a (co)presheaf interpretation into a theory functor homomorphism natural transformation simply typed lambda calculus (with products) a cartesian closed category Conclusion

In some ways I’ve stopped just when things were about to get good. I may do a follow-up to elaborate on this good stuff. Some examples are: if I expand the definition so that Set becomes a “theory”, then interpretations also form such a “theory”, and these are often what we’re really interested in. The category of finite-product preserving interpretations of the theory of groups essentially is the category of groups. In fact, universal algebra is, in categorical terms, just the study of categories with finite products and finite-product preserving functors from them, particularly into Set. It’s easy to generalize this in many directions. It’s also easy to make very general definitions, like a general definition of a free algebraic structure. In general, we’re usually more interested in the interpretations of a theory than the theory itself.

While I often do advocate thinking in terms of internal languages of categories, I’m not sure that it is a preferable perspective for the very basics of category theory. Nevertheless, there are a few reasons for why I wrote this. First, this very syntactical approach is, I think, more accessible to someone coming from a programming background. From this view, a category is a very simple programming language. Adding structure to the category corresponds to adding features to this programming language. Interpretations are denotational semantics.

Another aspect about this presentation that is quite different is the use and emphasis on collages. Collages correspond to profunctors, a crucially important and enabling concept that is rarely covered in categorical introductions. The characterization of profunctors as collages in Vaughn Pratt’s paper (not using that name) was one of the things I enjoyed about that paper and part of what prompted me to start writing this. In earlier, drafts of this article, I was unable to incorporate collages in a meaningful way as I was trying to start from profunctors. This approach just didn’t add value. Collages just looked like a bizarre curio and weren’t integrated into the narrative at all. For other reasons, though, I ended up revisiting the idea of a heteromorphism. My (fairly superficial) opinion is that once you have the notion of functors and natural transformations, adding the notion of heteromorphisms has a low power-to-weight ratio, though it does make some things a bit nicer. Nevertheless, in thinking of how best to fit them into this context, it was clear that collages provided the perfect mechanism (which isn’t a big surprise), and the result works rather nicely. When I realized a fact that can be cryptically but compactly represented as #ccK_ccT ≃ bbbI xx ccT# where #bbbI# is the interval category, i.e. two objects with a single arrow joining them, I realized that this is actually an interesting perspective. Since most of this article was written at that point, I wove collages into the narrative replacing some things. If, though, I had started with this perspective from the beginning I suspect I would have made a significantly different article, though the latter sections would likely be similar.

  1. It’s actually better to organize this as a family of collections of function symbols indexed by pairs of sorts.

  2. Instead of having equations that generate an equivalence relation on (raw) terms, we could simply require an equivalence relation on (raw) terms be directly provided.

  3. Collaging is actually quite natural in this context. I already intend to support one theory importing another. A collage is just a theory that imports two others and then adds function symbols between them.

  4. For programmers familiar with Agda, at least, if you haven’t made this connection, this might help you understand and appreciate what a “class” is versus a “set” and what “size issues” are, which is typically handled extremely vaguely in a lot of the literature.

Categories: Offsite Blogs

Mark Jason Dominus: Imaginary Albanian eggplant festivals… IN SPACE

Planet Haskell - Thu, 11/24/2016 - 12:04pm

Wikipedia has a list of harvest festivals which includes this intriguing entry:

Ysolo: festival marking the first day of harvest of eggplants in Tirana, Albania

(It now says “citation needed“; I added that yesterday.)

I am confident that this entry, inserted in July 2012 by an anonymous user, is a hoax. When I first read it, I muttered “Oh, what bullshit,” but then went looking for a reliable source, because you never know. I have occasionally been surprised in the past, but this time I found clear evidence of a hoax: There are only a couple of scattered mentions of Ysolo on a couple of blogs, all from after 2012, and nothing at all in Google Books about Albanian eggplant celebrations. Nor is there an article about it in Albanian Wikipedia.

But reality gave ground before I arrived on the scene. Last September NASA's Dawn spacecraft visited the dwarf planet Ceres. Ceres is named for the Roman goddess of the harvest, and so NASA proposed harvest-related names for Ceres’ newly-discovered physical features. It appears that someone at NASA ransacked the Wikipedia list of harvest festivals without checking whether they were real, because there is now a large mountain at Ceres’ north pole whose official name is Ysolo Mons, named for this spurious eggplant festival. (See also: NASA JPL press release; USGS Astrogeology Science Center announcement.)

To complete the magic circle of fiction, the Albanians might begin to celebrate the previously-fictitious eggplant festival. (And why not? Eggplants are lovely.) Let them do it for a couple of years, and then Wikipedia could document the real eggplant festival… Why not fall under the spell of Tlön and submit to the minute and vast evidence of an ordered planet?

Happy Ysolo, everyone.

[ Addendum 20161208: Ysolo has been canceled ]

Categories: Offsite Blogs

Dominic Steinitz: Mercator: A Connection with Torsion

Planet Haskell - Thu, 11/24/2016 - 11:04am

In most presentations of Riemannian geometry, e.g. O’Neill (1983) and Wikipedia, the fundamental theorem of Riemannian geometry (“the miracle of Riemannian geometry”) is given: that for any semi-Riemannian manifold there is a unique torsion-free metric connection. I assume partly because of this and partly because the major application of Riemannian geometry is General Relativity, connections with torsion are given little if any attention.

It turns out we are all very familiar with a connection with torsion: the Mercator projection. Some mathematical physics texts, e.g. Nakahara (2003), allude to this but leave the details to the reader. Moreover, this connection respects the metric induced from Euclidean space.

We use SageManifolds to assist with the calculations. We hint at how this might be done more slickly in Haskell.

A Cartographic Aside %matplotlib inline /Applications/SageMath/local/lib/python2.7/site-packages/traitlets/ DeprecationWarning: A parent of InlineBackend._config_changed has adopted the new @observe(change) API clsname, change_or_name), DeprecationWarning) import matplotlib import numpy as np import matplotlib.pyplot as plt import cartopy import as ccrs from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter plt.figure(figsize=(8, 8)) ax = plt.axes( ax.gridlines() ax.add_feature(cartopy.feature.LAND) ax.add_feature(cartopy.feature.COASTLINE)


We can see Greenland looks much broader at the North than in the middle. But if we use a polar projection (below) then we see this is not the case. Why then is the Mercator projection used in preference to e.g. the polar projection or the once controversial Gall-Peters – see here for more on map projections.

plt.figure(figsize=(8, 8)) bx = plt.axes( bx.set_extent([-180, 180, 90, 50], ccrs.PlateCarree()) bx.gridlines() bx.add_feature(cartopy.feature.LAND) bx.add_feature(cartopy.feature.COASTLINE)



This is written as an Jupyter notebook. In theory, it should be possible to run it assuming you have installed at least sage and Haskell. To publish it, I used

jupyter-nbconvert --to markdown Mercator.ipynb pandoc -s -t markdown+lhs -o Mercator.lhs \ --filter pandoc-citeproc --bibliography DiffGeom.bib BlogLiteratelyD --wplatex Mercator.lhs > Mercator.html

Not brilliant but good enough.

Some commands to jupyter to display things nicely.

%display latex viewer3D = 'tachyon' Warming Up With SageManifolds

Let us try a simple exercise: finding the connection coefficients of the Levi-Civita connection for the Euclidean metric on in polar co-ordinates.

Define the manifold.

N = Manifold(2, 'N',r'\mathcal{N}', start_index=1)

Define a chart and frame with Cartesian co-ordinates.

ChartCartesianN.<x,y> = N.chart() FrameCartesianN = ChartCartesianN.frame()

Define a chart and frame with polar co-ordinates.

ChartPolarN.<r,theta> = N.chart() FramePolarN = ChartPolarN.frame()

The standard transformation from Cartesian to polar co-ordinates.

cartesianToPolar = ChartCartesianN.transition_map(ChartPolarN, (sqrt(x^2 + y^2), arctan(y/x))) print(cartesianToPolar) Change of coordinates from Chart (N, (x, y)) to Chart (N, (r, theta)) print(latex(cartesianToPolar.display()))

cartesianToPolar.set_inverse(r * cos(theta), r * sin(theta)) Check of the inverse coordinate transformation: x == x y == y r == abs(r) theta == arctan(sin(theta)/cos(theta))

Now we define the metric to make the manifold Euclidean.

g_e = N.metric('g_e') g_e[1,1], g_e[2,2] = 1, 1

We can display this in Cartesian co-ordinates.


And we can display it in polar co-ordinates


Next let us compute the Levi-Civita connection from this metric.

nab_e = g_e.connection() print(latex(nab_e))

If we use Cartesian co-ordinates, we expect that . Only non-zero entries get printed.


Just to be sure, we can print out all the entries.


In polar co-ordinates, we get


Which we can rew-rewrite as

with all other entries being 0.

The Sphere

We define a 2 dimensional manifold. We call it the 2-dimensional (unit) sphere but it we are going to remove a meridian to allow us to define the desired connection with torsion on it.

S2 = Manifold(2, 'S^2', latex_name=r'\mathbb{S}^2', start_index=1) print(latex(S2))

To start off with we cover the manifold with two charts.

polar.<th,ph> = S2.chart(r'th:(0,pi):\theta ph:(0,2*pi):\phi'); print(latex(polar))

mercator.<xi,ze> = S2.chart(r'xi:(-oo,oo):\xi ze:(0,2*pi):\zeta'); print(latex(mercator))

We can now check that we have two charts.


We can then define co-ordinate frames.

epolar = polar.frame(); print(latex(epolar))

emercator = mercator.frame(); print(latex(emercator))

And define a transition map and its inverse from one frame to the other checking that they really are inverses.

xy_to_uv = polar.transition_map(mercator, (log(tan(th/2)), ph)) xy_to_uv.set_inverse(2*arctan(exp(xi)), ze) Check of the inverse coordinate transformation: th == 2*arctan(sin(1/2*th)/cos(1/2*th)) ph == ph xi == xi ze == ze

We can define the metric which is the pullback of the Euclidean metric on .

g = S2.metric('g') g[1,1], g[2,2] = 1, (sin(th))^2

And then calculate the Levi-Civita connection defined by it.

nab_g = g.connection() print(latex(nab_g.display()))

We know the geodesics defined by this connection are the great circles.

We can check that this connection respects the metric.


And that it has no torsion.

print(latex(nab_g.torsion().display())) 0 A New Connection

Let us now define an orthonormal frame.

ch_basis = S2.automorphism_field() ch_basis[1,1], ch_basis[2,2] = 1, 1/sin(th) e = S2.default_frame().new_frame(ch_basis, 'e') print(latex(e))

We can calculate the dual 1-forms.

dX = S2.coframes()[2] ; print(latex(dX))

print(latex((dX[1], dX[2])))

print(latex((dX[1][:], dX[2][:])))

In this case it is trivial to check that the frame and coframe really are orthonormal but we let sage do it anyway.

print(latex(((dX[1](e[1]).expr(), dX[1](e[2]).expr()), (dX[2](e[1]).expr(), dX[2](e[2]).expr()))))

Let us define two vectors to be parallel if their angles to a given meridian are the same. For this to be true we must have a connection with .

nab = S2.affine_connection('nabla', latex_name=r'\nabla') nab.add_coef(e)

Displaying the connection only gives the non-zero components.


For safety, let us check all the components explicitly.


Of course the components are not non-zero in other frames.



This connection also respects the metric .


Thus, since the Levi-Civita connection is unique, it must have torsion.


The equations for geodesics are

Explicitly for both variables in the polar co-ordinates chart.

We can check that and are solutions although sage needs a bit of prompting to help it.

t = var('t'); a = var('a') print(latex(diff(a * log(tan(t/2)),t).simplify_full()))

We can simplify this further by recalling the trignometric identity.

print(latex(sin(2 * t).trig_expand()))

print(latex(diff (a / sin(t), t)))

In the mercator co-ordinates chart this is

In other words: straight lines.

Reparametersing with we obtain

Let us draw such a curve.

R.<t> = RealLine() ; print(R) Real number line R print(dim(R)) 1 c = S2.curve({polar: [2*atan(exp(-t/10)), t]}, (t, -oo, +oo), name='c') print(latex(c.display()))


c.plot(chart=polar, aspect_ratio=0.1)


It’s not totally clear this is curved so let’s try with another example.

d = S2.curve({polar: [2*atan(exp(-t)), t]}, (t, -oo, +oo), name='d') print(latex(d.display()))

d.plot(chart=polar, aspect_ratio=0.2)


Now it’s clear that a straight line is curved in polar co-ordinates.

But of course in Mercator co-ordinates, it is a straight line. This explains its popularity with mariners: if you draw a straight line on your chart and follow that bearing or rhumb line using a compass you will arrive at the end of the straight line. Of course, it is not the shortest path; great circles are but is much easier to navigate.

c.plot(chart=mercator, aspect_ratio=0.1)


d.plot(chart=mercator, aspect_ratio=1.0)


We can draw these curves on the sphere itself not just on its charts.

R3 = Manifold(3, 'R^3', r'\mathbb{R}^3', start_index=1) cart.<X,Y,Z> = R3.chart(); print(latex(cart))

Phi = S2.diff_map(R3, { (polar, cart): [sin(th) * cos(ph), sin(th) * sin(ph), cos(th)], (mercator, cart): [cos(ze) / cosh(xi), sin(ze) / cosh(xi), sinh(xi) / cosh(xi)] }, name='Phi', latex_name=r'\Phi')

We can either plot using polar co-ordinates.

graph_polar = polar.plot(chart=cart, mapping=Phi, nb_values=25, color='blue') show(graph_polar, viewer=viewer3D)


Or using Mercator co-ordinates. In either case we get the sphere (minus the prime meridian).

graph_mercator = mercator.plot(chart=cart, mapping=Phi, nb_values=25, color='red') show(graph_mercator, viewer=viewer3D)


We can plot the curve with an angle to the meridian of

graph_c = c.plot(mapping=Phi, max_range=40, plot_points=200, thickness=2) show(graph_polar + graph_c, viewer=viewer3D)


And we can plot the curve at angle of to the meridian.

graph_d = d.plot(mapping=Phi, max_range=40, plot_points=200, thickness=2, color="green") show(graph_polar + graph_c + graph_d, viewer=viewer3D)



With automatic differentiation and symbolic numbers, symbolic differentiation is straigtforward in Haskell.

> import Data.Number.Symbolic > import Numeric.AD > > x = var "x" > y = var "y" > > test xs = jacobian ((\x -> [x]) . f) xs > where > f [x, y] = sqrt $ x^2 + y^2 ghci> test [1, 1] [[0.7071067811865475,0.7071067811865475]] ghci> test [x, y] [[x/(2.0*sqrt (x*x+y*y))+x/(2.0*sqrt (x*x+y*y)),y/(2.0*sqrt (x*x+y*y))+y/(2.0*sqrt (x*x+y*y))]]

Anyone wishing to take on the task of producing a Haskell version of sagemanifolds is advised to look here before embarking on the task.

Appendix A: Conformal Equivalence

Agricola and Thier (2004) shows that the geodesics of the Levi-Civita connection of a conformally equivalent metric are the geodesics of a connection with vectorial torsion. Let’s put some but not all the flesh on the bones.

The Koszul formula (see e.g. (O’Neill 1983)) characterizes the Levi-Civita connection

Being more explicit about the metric, this can be re-written as

Let be the Levi-Civita connection for the metric where . Following [Gadea2010] and substituting into the Koszul formula and then applying the product rule

Where as usual the vector field, for , is defined via .

Let’s try an example.

nab_tilde = S2.affine_connection('nabla_t', r'\tilde_{\nabla}') f = S2.scalar_field(-ln(sin(th)), name='f') for i in S2.irange(): for j in S2.irange(): for k in S2.irange(): nab_tilde.add_coef()[k,i,j] = \ nab_g(polar.frame()[i])(polar.frame()[j])(polar.coframe()[k]) + \ polar.frame()[i](f) * polar.frame()[j](polar.coframe()[k]) + \ polar.frame()[j](f) * polar.frame()[i](polar.coframe()[k]) + \ g(polar.frame()[i], polar.frame()[j]) * \ polar.frame()[1](polar.coframe()[k]) * cos(th) / sin(th) print(latex(nab_tilde.display()))

print(latex(nab_tilde.torsion().display())) 0 g_tilde = exp(2 * f) * g print(latex(g_tilde.parent()))


nab_g_tilde = g_tilde.connection() print(latex(nab_g_tilde.display()))

It’s not clear (to me at any rate) what the solutions are to the geodesic equations despite the guarantees of Agricola and Thier (2004). But let’s try a different chart.


In this chart, the geodesics are clearly straight lines as we would hope.


Agricola, Ilka, and Christian Thier. 2004. “The geodesics of metric connections with vectorial torsion.” Annals of Global Analysis and Geometry 26 (4): 321–32. doi:10.1023/B:AGAG.0000047509.63818.4f.

Nakahara, M. 2003. “Geometry, Topology and Physics.” Text 822: 173–204. doi:10.1007/978-3-642-14700-5.

O’Neill, B. 1983. Semi-Riemannian Geometry with Applications to Relativity, 103. Pure and Applied Mathematics. Elsevier Science.

Categories: Offsite Blogs

Yesod Web Framework: The New (Experimental) Yesod Development Server - Feedback Requested!

Planet Haskell - Wed, 11/23/2016 - 11:15pm

I'm guessing almost every Yesod user has - at some point - used the venerable yesod devel command, which launches a server which auto-recompiles your source code on any file changes. This has been a core part of the Yesod ecosystem for many years. Unfortunately, it's had to be far more complicated than I'd have liked:

  • Since it predates addDependentFile (good work Greg Weber on getting that in!), it has some pretty complex logic around guessing which external files (like Hamlet files) should force a recompile. (Adding support for addDependentFile to the current yesod devel is possible, but it's a non-trivial undertaking.)
  • In order to ensure a consistent set of dependencies, it does some real fancy footwork around intercepting arguments passed to ghc and linker executables.
  • In order to parse various files, it links against the ghc library, tying it to a specific compiler version. This makes things difficult for users (don't accidentally use yesod from GHC 7.10.3 with GHC 8.0.1!), and sometimes really painful for maintainers.

For a few months now, I've been meaning to greatly simplify yesod devel, but the maintenance burden finally gave me the excuse I needed to bite the bullet and do it. The result is a dramatic simplification of the code base. First I'd like to ask for user feedback, and then I'll discuss some of the details of implementation.

Please try it out!

Since this is such a big change, I'd really appreciate if others could give this a shot before I release it. There are two ways you can do this:

Method 1:

  • git clone --branch 1304-stack-based-devel yesod-new-devel
  • cd yesod-new-devel
  • stack install yesod-bin
  • From your project directory, run yesod devel. NOTE: do not use stack exec -- yesod devel, you want to use the newly globally installed executable, not the one from your snapshot!

Method 2:

  • Add the following to your stack.yaml file's packages list:

    - location: git: commit: f3fc735a25eb3d5c051c761b59070eb9a0e4e156 subdirs: - yesod-bin extra-dep: true
  • Likely: add the following to your stack.yaml file's extra-deps list:

    - say- - typed-process-
  • stack build yesod-bin
  • stack exec -- yesod devel

Use whichever method you feel most comfortable with. Please let me know both successes and failures, and then I'll try to get this rolled out. Comments would be great on the Github pull request. So far, in my limited testing, I've found that the new yesod devel runs faster than the current one, but that could very much be confirmation bias speaking.

Note: there are a few removed features in this update, please see the changelog.

How it works

The big change - as the branch name implies - was depending entirely on Stack for all of the heavy lifting. Stack already provides a --file-watch command to automatically recompile, and uses GHC's own addDependentFile information to track external file dependencies. This cuts out the vast majority of the complexity. There's no longer any need to depend on the ghc library, there's less Cabal library code involved (making cross-version support much simpler), and almost everything is handled by shelling out to external executables.

I also got to redo the concurrency aspects of this using my absolute favorite package in the world: async. The result is, in my opinion, very straightforward. I also leveraged some of the newer libraries I've worked on, like safe-exceptions, typed-process, and say.

The code is (finally) well commented, so you can jump in and look yourself. I've also added a decent README, and an example of using yesod devel with a non-Yesod project.

Categories: Offsite Blogs

Neil Mitchell: The Website Working Group (HWWG)

Planet Haskell - Wed, 11/23/2016 - 4:04pm
Haskell represents both a language and a user community - and moreover a fantastic community full of friends, fun, and deep technical debate. Unfortunately, in recent times the community has started to fracture, e.g. Cabal vs Stack, vs These divisions have risen above technical disagreements and at some points turned personal. The solution, shepherded by Simon Peyton Jones, and agreed to by both members of the committee and the maintainers of, is to form the Haskell Website Working Group (HWWG). The charter of the group is at the bottom of this post.

The goal of the Haskell Website Working Group is to make sure the Haskell website caters to the needs of Haskell programmers, particularly beginners. In doing so we hope to either combine or differentiate and, and give people clear recommendations of what "downloading Haskell" means. Concretely, we hope that either redirects to, or that ends up being used for something very different from today.

The Haskell Website Working Group (HWWG)Scope and goals
  • The HWWG is responsible for the design and content of the user-facing web site, including tutorials, videos, news, resource, downloads, etc.

  • The HWWG is not responsible for:
    • The infrastructure of
    • Toolchains, Hackage, compilers, etc
  • The HWWG focuses on serving users of Haskell, not suppliers of technology or libraries.
  • An explicit goal is to re-unite the and web sites.
Expected mode of operation
  • HWWG is not responsible for actually doing everything! The web site is on github. Anyone can make a pull request. The general expectation is that uncontroversial changes will be accepted and committed without much debate.
  • If there is disagreement about a proposed change, it's up to the HWWG to engage in (open) debate, and to seek input as widely as time allows, but finally to make a decision.

Initial membership comprises of:

  • Neil Mitchell (chair)
  • Nicolas Wu
  • Andrew Cowie
  • Vincent Hanquez
  • Ryan Trinkle
  • Chris Done

It is expected the committee will change over time, but the mechanism has not yet been thought about.

Rules of engagement
  • Recognising that honestly-held judgements may differ, we will be scrupulously polite both in public and in private.
  • Recognising that Haskell has many users, and that different users have different needs and tastes, we want to be inclusive rather than exclusive, providing a variety of alternative resources (toolchains, tutorials, books, etc) clearly signposted with their intended audiences.
  • Ultimately the committee owns the URL, but it delegates authority for the design and content of the web site to the HWWG. In extremis, if the committee believes that the HWWG is mismanaging the web site, it can revoke that delegation.
Categories: Offsite Blogs

Joachim Breitner: microG on Jolla

Planet Haskell - Wed, 11/23/2016 - 11:44am

I am a incorrigibly in picking non-mainstream, open smartphones, and then struggling hard. Back then in 2008, I tried to use the OpenMoko FreeRunner, but eventually gave up because of hardware glitches and reverted to my good old Siemens S35. It was not that I would not be willing to put up with inconveniences, but as soon as it makes live more difficult for the people I communicate with, it becomes hard to sustain.

Two years ago I tried again, and got myself a Jolla phone, running Sailfish OS. Things are much nicer now: The hardware is mature, battery live is good, and the Android compatibility layer enables me to run many important apps that are hard to replace, especially the Deutsche Bahn Navigator and various messengers, namely Telegram, Facebook Messenger, Threema and GroupMe.

Some apps that require Google Play Services, which provides a bunch of common tasks and usually comes with the Google Play store would not run on my phone, as Google Play is not supported on Sailfish OS. So far, the most annoying ones of that sort were Uber and Lyft, making me pay for expensive taxis when others would ride cheaper, but I can live with that. I tried to install Google Play Services from shady sources, but it would regularly crash.

Signal on Jolla

Now in Philadelphia, people urged me to use the Signal messenger, and I was convinced by its support for good end-to-end crypto, while still supporting offline messages and allowing me to switch from my phone to my desktop and back during a conversation. The official Signal app uses Google Cloud Messaging (GCM, part of Google Play Services) to get push updates about new posts, and while I do not oppose this use of Google services (it really is just a ping without any metadata), this is a problem on Sailfish OS.

Luckily, the Signal client is open source, and someone created a “LibreSignal” edition that replaced the use of GCM with websockets, and indeed, this worked on my phone, and I could communicate.

Things were not ideal, though: I would often have to restart the app to get newly received messages; messages that I send via Signal Desktop would often not show up on the phone and, most severe, basically after every three messages, sending more messages from Desktop would stop working for my correspondents, which freaked them out. (Strangely it continued working from their phone app, so we coped for a while.)

So again, my choice of non-standard devices causes inconveniences to others. This, and the fact that the original authors of Signal and the maintainers of LibreSignal got into a fight that ended LibreSignal discontinued, meant that I have to change something about this situation. I was almost ready to give in and get myself a Samsung S7 or something boring of the sort, but then I decided to tackle this issue once more, following some of the more obscure instructions out there, trying to get vanilla Signal working on my phone. About a day later, I got it, and this is how I did it.


So I need Google Play Services somehow, but installing the “real thing” did not seem to be very promising (I tried, and regularly got pop-ups telling me that Play Services has crashed.) But I found some references to a project called “microG”, which is an independent re-implementation of (some of) of the play services, in particular including GCM.

Installing microG itself was easy, as you can add their repository to F-Droid. I installed the core services, the services framework and the fake store apps. If this had been all that was to do, things would be easy!

Play Store detection work arounds

But Signal would still complain about the lack of Google Play Services. It asks Android if an app with a certain name is installed, and would refuse to work if this app does not exist. For some reason, the microG apps cannot just have the names of the “real” Google apps.

There seem to be two ways of working around this: Patching Signal, or enabling Signature Spoofing.

The initially most promising instructions (which are in a README in a tarball on a fishy file hoster linked from an answer on the Jolla support forum…) suggested patching Signal, and actually came both with a version of an app called “Lucky Patcher” as well as a patched Android package, but both about two years old. I tried a recent version of the Lucky Patcher, but it failed to patch the current version of Signal.

Signature Spoofing

So on to Signature Spoofing. This is a feature of some non-standard Android builds that allow apps (such as microG) to fake the existence of other apps (the Play Store), and is recommended by the microG project. Sailfish OS’s Android compatibility layer “Alien Dalvik” does not support it out of the box, but there is a tool “tingle” that adds this feature to existing Android systems. One just has to get the /system/framework/framework.jar file, put it into the input folder of this project, run python, select 2, and copy the framework.jar from output/ back. Great.

Deodexing Alien Dalvik

Only that it only works on “deodexed” files. I did not know anything about odexed Android Java classes (and did not really want to know), but there was not way around. Following this explanation I gathered that one finds files foo.odex in the Android system folder, runs some tool on them to create a classes.dex file, and adds that to the corresponding foo.jar or foo.apk file, copies this back to the phone and deletes the foo.odex file.

The annoying this is that one does not only have to do it for framework.jar in order to please tingle, because if one does it to one odex file, one has to do to all! It seems that for people using Windows, the Universal Deodexer V5 seems to be a convenient tool, but I had to go more manually.

So I first fetched “smali”, compiled it using ./gradlew build. Then I fetched the folders /opt/alien/system/framework and /opt/alien/system/app from the phone (e.g. using scp). Keep a backup of these in case something breaks. Then I ran these commands (disclaimer: I fetched these from my bash history and slightly cleaned them up. This is not a fire-and-forget script! Use it when you know what it and you are doing):

cd framework for file in *.odex do java -jar ~/build/smali/baksmali/build/libs/baksmali.jar deodex $file -o out java -jar ~/build/smali/smali/build/libs/smali.jar a out -o classes.dex zip -u $(basename $file .odex).jar classes.dex rm -rf out classes.dex $file done cd .. cd app for file in *.odex do java -jar ~/build/smali/baksmali/build/libs/baksmali.jar deodex -d ../framework $file -o out java -jar ~/build/smali/smali/build/libs/smali.jar a out -o classes.dex zip -u $(basename $file .odex).apk classes.dex rm -rf out classes.dex $file done cd ..

The resulting framework.jar can now be patched with tingle:

mv framework/framework.jar ~/build/tingle/input cd ~/build/tingle ./ # select 2 cd - mv ~/build/tingle/output/framework.jar framework/framework.jar

Now I copy these framework and app folders back on my phone, and restart Dalvik:

devel-su systemctl restart aliendalvik.service

It might start a bit slower than usually, but eventually, all the Android apps should work as before.

The final bit that was missing in my case was that I had to reinstall Signal: If it is installed before microG is installed, it does not get permission to use GCM, and when it tries (while registering: After generating the keys) it just crashes. I copied /data/data/org.thoughtcrime.secretsms/ before removing Signal and moved it back after (with cp -a to preserve permissions) so that I could keep my history.

And now, it seems, vanilla Signal is working just fine on my Jolla phone!

What’s missing

Am I completely happy with Signal? No! An important feature that it is lacking is a way to get out all data (message history including media files) in a file format that can be read without Signal; e.g. YAML files or clean HTML code. I do want to be able to re-read some of the more interesting conversations when I am 74 or 75, and I doubt that there will be a Signal App, or even Android, then. I hope that this becomes available in time, maybe in the Desktop version.

I would also hope that pidgin gets support to the Signal protocol, so that I conveniently use one program for all my messaging needs on the desktop.

Finally it would be nice if my Signal identity was less tied to one phone number. I have a German and a US phone number, and would want to be reachable under both on all my clients. (If you want to contact me on Signal, use my US phone number.)


Could I have avoided this hassle by simply convincing people to use something other than Signal? Tricky, at the moment. Telegram (which works super reliable for me, and has a pidgin plugin) has dubious crypto and does not support crypto while using multiple clients. Threema has no desktop client that I know of. OTR on top of Jabber does not support offline messages. So nothing great seems to exist right now.

In the long run, the best bet seems to be OMEMO (which is, in essence, the Signal protocol) on top of Jabber. It is currently supported by one Android Jabber client (Conversations) and one Desktop application (gajim, via a plugin). I should keep an eye on pidgin support for OMEMO and other development around this.

Categories: Offsite Blogs

Polymorphism, subtyping and type inference in MLsub

Lambda the Ultimate - Wed, 11/23/2016 - 10:54am

I am very enthusiastic about the following paper: it brings new ideas and solves a problem that I did not expect to be solvable, namely usable type inference when both polymorphism and subtyping are implicit. (By "usable" here I mean that the inferred types are both compact and principal, while previous work generally had only one of those properties.)

Polymorphism, Subtyping, and Type Inference in MLsub
Stephen Dolan and Alan Mycroft

We present a type system combining subtyping and ML-style parametric polymorphism. Unlike previous work, our system supports type inference and has compact principal types. We demonstrate this system in the minimal language MLsub, which types a strict superset of core ML programs.

This is made possible by keeping a strict separation between the types used to describe inputs and those used to describe outputs, and extending the classical unification algorithm to handle subtyping constraints between these input and output types. Principal types are kept compact by type simplification, which exploits deep connections between subtyping and the algebra of regular languages. An implementation is available online.

The paper is full of interesting ideas. For example, one idea is that adding type variables to the base grammar of types -- instead of defining them by their substitution -- forces us to look at our type systems in ways that are more open to extension with new features. I would recommend looking at this paper even if you are interested in ML and type inference, but not subtyping, or in polymorphism and subtyping, but not type inference, or in subtyping and type inference, but not functional languages.

This paper is also a teaser for the first's author PhD thesis, Algebraic Subtyping, currently not available online -- I suppose the author is not very good at the internet.

(If you are looking for interesting work on inference of polymorphism and subtyping in object-oriented languages, I would recommend Getting F-Bounded Polymorphism into Shape by Ben Greenman, Fabian Muehlboeck and Ross Tate, 2014.)

Categories: Offsite Discussion

Salon des Refusés -- Dialectics for new computer science

Lambda the Ultimate - Wed, 11/23/2016 - 10:20am

In the first decade of the twenty-first century, the Feyerabend Project organized several workshops to discuss and develop new ways to think of programming languages and computing in general. A new event in this direction is a new workshop that will take place in Brussels, in April, co-located with the new <Programming> conference -- also worth a look.

Salon des Refusés -- Dialectics for new computer science
April 2017, Brussels

Salon des Refusés (“exhibition of rejects”) was an 1863 exhibition of artworks rejected from the official Paris Salon. The jury of Paris Salon required near-photographic realism and classified works according to a strict genre hierarchy. Paintings by many, later famous, modernists such as Édouard Manet were rejected and appeared in what became known as the Salon des Refusés. This workshop aims to be the programming language research equivalent of Salon des Refusés. We provide venue for exploring new ideas and new ways of doing computer science.

Many interesting ideas about programming might struggle to find space in the modern programming language research community, often because they are difficult to evaluate using established evaluation methods (be it proofs, measurements or controlled user studies). As a result, new ideas are often seen as “unscientific”.

This workshop provides a venue where such interesting and thought provoking ideas can be exposed to critical evaluation. Submissions that provoke interesting discussion among the program committee members will be published together with an attributed review that presents an alternative position, develops additional context or summarizes discussion from the workshop. This means of engaging with papers not just enables explorations of novel programming ideas, but also enourages new ways of doing computer science.

The workshop's webpage also contains descriptions of of some formats that could "make it possible to think about programming in a new way", including: Thought experiments, Experimentation, Paradigms, Metaphors, myths and analogies, and From jokes to science fiction.

For writings on similar questions about formalism, paradigms or method in programming language research, see Richard Gabriel's work, especially The Structure of a Programming Language Revolution (2012) and Writers’ Workshops As Scientific Methodology (?)), Thomas Petricek's work, especially Against a Universal Definition of 'Type' (2015) and Programming language theory Thinking the unthinkable (2016)), and Jonathan Edwards' blog: Alarming Development.

For programs of events of similar inspiration in the past, you may be interested in the Future of Programming workshops: program of 2014, program of September 2015, program of October 2015. Other events that are somewhat similar in spirit -- but maybe less radical in content -- are Onward!, NOOL and OBT.

Categories: Offsite Discussion

Michael Snoyman: Haskell for Dummies

Planet Haskell - Tue, 11/22/2016 - 6:00pm

There was an image that made the rounds a while ago.

The joke being: haha, Haskell is only for super-geniuses like Einstein. There's lots to complain about in this chart, but I'm going to pick on the lower-right corner. Specifically:

Haskellers don't use Haskell because we think we're Einstein. We use Haskell because we know we aren't.

When I speak to Haskellers, the general consensus is: "I'm not smart enough to write robust code in a language like Python." We're not using Haskell because we're brilliant; we're using Haskell because we know we need a language that will protect us from ourselves.

That said, I should acknowledge that Haskell does have a steeper learning curve for most programmers. But this is mostly to do with unfamiliarity: Haskell is significantly different from languages like Python, Ruby, and Java, whereas by contrast those languages are all relatively similar to each other. Great educational material helps with this.

You should set your expectations appropriately: it will take you longer to learn Haskell, but it's worth it. Personally, I use Haskell because:

  • It gives me the highest degree of confidence that I'll write my program correctly, due to its strong, static typing
  • It has great support for modern programming techniques, like functional programming and green-thread-based concurrency
  • I can write more maintainable code in it than other languages
  • It has a great set of libraries and tools
  • It's got great performance characteristics for high-level code, and allows low-level performance tweaking when needed

I'm certainly leaving off a lot of points here, my goal isn't to be comprehensive. Instead, I'd like to dispel with this notion of the Haskeller super-genius. We Haskellers don't believe it. We know why we're using a language like Haskell: to protect us from ourselves.

Categories: Offsite Blogs

FP Complete: Scripting in Haskell

Planet Haskell - Mon, 11/21/2016 - 8:00pm

Writing scripts in Haskell using Stack is straight-forward and reliable. We've made a screencast to demonstrate this:

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="315" src="" width="560"></iframe>Summary

Slides in the screencast cover:

  • What Haskell is
  • What Stack is
  • We make a case for reproducible scripting

We cover the following example cases:

  • Hello, World! as a Haskell script
  • Generating a histogram of lines of a file
  • Using FSNotify to watch a directory for any changes to files

In summary, we show:

  1. Scripting in Haskell is reproducible: it works over time without bitrot.
  2. It's easy to do, just add a shebang #!/usr/bin/env stack to the top of your file.
  3. We can use useful, practical libraries that are not available in traditional scripting languages.
  4. Our script is cross-platform: it'll work on OS X, Linux or Windows.
Categories: Offsite Blogs

FP Complete: Comparative Concurrency with Haskell

Planet Haskell - Mon, 11/21/2016 - 8:00pm

Last week, I was at DevConTLV X and attended a workshop by Aaron Cruz. While the title was a bit of a lie (it wasn't four hours, and we didn't do a chat app), it was a great way to see some basics of concurrency in different languages. Of course, that made me all the more curious to add Haskell to the mix.

I've provided multiple different implementations of this program in Haskell, focusing on different approaches (matching the approaches of the other languages, highly composable code, and raw efficiency). These examples are intended to be run and experimented with. The only requirement is that you install the Haskell build tool Stack. You can download a Windows installer, or on OS X and Linux run:

curl -sSL | sh

We'll start with approaches very similar to other languages like Go and Rust, and then dive into techniques like Software Transactional Memory which provide a much improved concurrency experience for more advanced workflows. Finally we'll dive into the async library, which provides some very high-level functions for writing concurrent code in a robust manner.

Unfortunately I don't have access to the source code for the other languages right now, so I can't provide a link to it. If anyone has such code (or wants to write up some examples for other lanugages), please let me know so I can add a link.

The problem

We want to spawn a number of worker threads which will each sleep for a random period of time, grab an integer off of a shared work queue, square it, and put the result back on a result queue. Meanwhile, a master thread will fill up the work queue with integers, and read and print results.

Running the examples

Once you've installed Stack, you can save each code snippet into a file with a .hs extension (like concurrency.hs), and then run it with stack concurrency.hs. If you're on OS X or Linux, you can also:

chmod +x concurrency.hs ./concurrency.hs

The first run will take a bit longer as it downloads the GHC compiler and installs library dependencies, but subsequent runs will be able to use the cached results. You can read more about scripting with Haskell.


Most of the other language examples used some form of channels. We'll begin with a channel-based implementation for a convenient comparison to other language implementations. As you'll see, Haskell's channel-based concurrency is quite similar to what you'd experience in languages like Go and Elixir.

#!/usr/bin/env stack -- stack --install-ghc --resolver lts-6.23 runghc --package random import Control.Concurrent (forkIO, threadDelay, readChan, writeChan, newChan) import Control.Monad (forever) import System.Random (randomRIO) -- The following type signature is optional. Haskell has type -- inference, which makes most explicit signatures unneeded. We -- usually include them at the top level for easier reading. workerCount, workloadCount, minDelay, maxDelay :: Int workerCount = 250 workloadCount = 10000 minDelay = 250000 -- in microseconds, == 0.25 seconds maxDelay = 750000 -- == 0.75 seconds -- Launch a worker thread. We take in the request and response -- channels to communicate on, as well as the ID of this -- worker. forkIO launches an action in a new thread, and forever -- repeats the given action indefinitely. worker requestChan responseChan workerId = forkIO $ forever $ do -- Get a random delay value between the min and max delays delay <- randomRIO (minDelay, maxDelay) -- Delay this thread by that many microseconds threadDelay delay -- Read the next item off of the request channel int <- readChan requestChan -- Write the response to the response channel writeChan responseChan (workerId, int * int) main = do -- Create our communication channels requestChan <- newChan responseChan <- newChan -- Spawn off our worker threads. mapM_ performs the given action -- on each value in the list, which in this case is the -- identifiers for each worker. mapM_ (worker requestChan responseChan) [1..workerCount] -- Define a helper function to handle each integer in the workload let perInteger int = do -- Write the current item to the request channel writeChan requestChan int -- Read the result off of the response channel (workerId, square) <- readChan responseChan -- Print out a little message putStrLn $ concat [ "Worker #" , show workerId , ": square of " , show int , " is " , show square ] -- Now let's loop over all of the integers in our workload mapM_ perInteger [1..workloadCount]

This is a pretty direct translation of how you would do things in a language like Go or Erlang/Elixir. We've replaced for-loops with maps, but otherwise things are pretty similar.

There's a major limitation in this implementation, unfortunately. In the master thread, our perInteger function is responsible for providing requests to the workers. However, it will only provide one work item at a time and then block for a response. This means that we are effectively limiting ourselves to one concurrent request. We'll address this in various ways in the next few examples.


It turns out in this case, we can use a lighter-weight alternative to a channel for the requests. Instead, we can just put all of our requests into an IORef - which is the basic mutable variable type in Haskell - and then pop requests off of the list inside that variable. Veterans of concurrency bugs will be quick to point out the read/write race condition you'd usually expect:

  1. Thread A reads the list from the variable
  2. Thread B reads the list from the variable
  3. Thread A pops the first item off the list and writes the rest to the variable
  4. Thread B pops the first item off the list and writes the rest to the variable

In this scenario, both threads A and B will end up with the same request to work on, which is certainly not our desired behavior. However, Haskell provides built-in compare-and-swap functionality, allowing us to guarantee that our read and write are atomic operations. This only works for a subset of Haskell functionality (specifically, the pure subset which does not have I/O side effects), which fortunately our pop-an-element-from-a-list falls into. Let's see the code.

#!/usr/bin/env stack -- stack --install-ghc --resolver lts-6.23 runghc --package random import Control.Concurrent (forkIO, threadDelay, writeChan, readChan, newChan) import Data.IORef (atomicModifyIORef, newIORef) import Control.Monad (replicateM_) import System.Random (randomRIO) workerCount = 250 workloadCount = 10000 minDelay = 250000 maxDelay = 750000 worker requestsRef responseChan workerId = forkIO $ do -- Define a function to loop on the available integers in the -- requests reference. let loop = do delay <- randomRIO (minDelay, maxDelay) threadDelay delay -- atomicModifyIORef is our compare-and-swap function. We -- give it a reference, and a function that works on the -- contained value. That function returns a pair of the -- new value for the reference, and a return value. mint <- atomicModifyIORef requestsRef $ \requests -> -- Let's see if we have anything to work with... case requests of -- Empty list, so no requests! Put an empty list -- back in and return Nothing [] -> ([], Nothing) -- We have something. Put the tail of the list -- back in the reference, and return the new item. int:rest -> (rest, Just int) -- Now we'll see what to do next based on whether or not -- we got something from the requests reference. case mint of -- No more requests, stop looping Nothing -> return () -- Got one, so... Just int -> do -- Write the response to the response channel writeChan responseChan (workerId, int, int * int) -- And loop again loop -- Kick off the loop loop main = do -- Create our request reference and response channel requestsRef <- newIORef [1..workloadCount] responseChan <- newChan mapM_ (worker requestsRef responseChan) [1..workerCount] -- We know how many responses to expect, so ask for exactly that -- many with replicateM_. replicateM_ workloadCount $ do -- Read the result off of the response channel (workerId, int, square) <- readChan responseChan -- Print out a little message putStrLn $ concat [ "Worker #" , show workerId , ": square of " , show int , " is " , show square ]

Compare-and-swap operations can be significantly more efficient than using true concurrency datatypes (like the Chans we saw above, or Software Transactional Memory). But they are also limiting, and don't compose nicely. Use them when you need a performance edge, or have some other reason to prefer an IORef.

Compared to our channels example, there are some differences in behavior:

  • In the channels example, our workers looped forever, whereas here they have an explicit stop condition. In reality, the Haskell runtime will automatically kill worker threads that are blocked on a channel without any writer. However, we'll see how to use closable channels in a later example.
  • The channels example would only allow one request on the request channel at a time. This is similar to some of the examples from other languages, but defeats the whole purpose of concurrency: only one worker will be occupied at any given time. This IORef approach allows multiple workers to have work items at once. (Again, we'll see how to achieve this with channels in a bit.)

You may be concerned about memory usage: won't holding that massive list of integers in memory all at once be expensive? Not at all: Haskell is a lazy language, meaning that the list will be constructed on demand. Each time a new element is asked for, it will be allocated, and then can be immediately garbage collected.

Software Transactional Memory

One of the most powerful concurrency techniques available in Haskell is Software Transactional Memory (STM). It allows us to have mutable variables, and to make modifications to them atomically. For example, consider this little snippet from a theoretical bank account application:

transferFunds from to amt = atomically $ do fromOrig <- readTVar from toOrig <- readTVar to writeTVar from (fromOrig - amt) writeTVar to (toOrig + amt)

In typically concurrent style, this would be incredibly unsafe: it's entirely possible for another thread to modify the from or to bank account values between the time our thread reads and writes them. However, with STM, we are guaranteed atomicity. STM will keep a ledger of changes made during an atomic transaction, and then attempt to commit them all at once. If any of the variables references have modified during the transaction, the ledger will be rolled back and tried again. And like atomicModifyIORef from above, Haskell disallows side-effects inside a transaction, so that this retry behavior cannot be observed from the outside world.

To stress this point: Haskell's STM can eliminate the possibility for race conditions and deadlocks from many common concurrency patterns, greatly simplifying your code. The leg-up that Haskell has over other languages in the concurrency space is the ability to take something that looks like calamity and make it safe.

We're going to switch our channels from above to STM channels. For the request channel, we'll use a bounded, closable channel (TBMChan). Bounding the size of the channel prevents us from loading too many values into memory at once, and using a closable channel allows us to tell our workers to exit.

#!/usr/bin/env stack {- stack --install-ghc --resolver lts-6.23 runghc --package random --package stm-chans -} import Control.Concurrent (forkIO, threadDelay, readChan, writeChan, newChan) import Control.Concurrent.STM (atomically, writeTChan, readTChan, newTChan) import Control.Concurrent.STM.TBMChan (readTBMChan, writeTBMChan, newTBMChan, closeTBMChan) import Control.Monad (replicateM_) import System.Random (randomRIO) workerCount = 250 workloadCount = 10000 minDelay = 250000 -- in microseconds, == 0.25 seconds maxDelay = 750000 -- == 0.75 seconds worker requestChan responseChan workerId = forkIO $ do let loop = do delay <- randomRIO (minDelay, maxDelay) threadDelay delay -- Interact with the STM channels atomically toContinue <- atomically $ do -- Get the next request, if the channel is open mint <- readTBMChan requestChan case mint of -- Channel is closed, do not continue Nothing -> return False -- Channel is open and we have a request Just int -> do -- Write the response to the response channel writeTChan responseChan (workerId, int, int * int) -- And yes, please continue return True if toContinue then loop else return () -- Kick it off! loop main = do -- Create our communication channels. We're going to ensure the -- request channel never gets more than twice the size of the -- number of workers to avoid high memory usage. requestChan <- atomically $ newTBMChan (workerCount * 2) responseChan <- atomically newTChan mapM_ (worker requestChan responseChan) [1..workerCount] -- Fill up the request channel in a dedicated thread forkIO $ do mapM_ (atomically . writeTBMChan requestChan) [1..workloadCount] atomically $ closeTBMChan requestChan replicateM_ workloadCount $ do -- Read the result off of the response channel (workerId, int, square) <- atomically $ readTChan responseChan -- Print out a little message putStrLn $ concat [ "Worker #" , show workerId , ": square of " , show int , " is " , show square ]

Overall, this looked pretty similar to our previous channels, which isn't surprising given the relatively basic usage of concurrency going on here. However, using STM is a good default choice in Haskell applications, due to how easy it is to build up complex concurrent programs with it.

Address corner cases

Alright, we've tried mirroring how examples in other languages work, given a taste of compare-and-swap, and explored the basics of STM. Now let's make our code more robust. The examples here - and those for other languages - often take some shortcuts. For example, what happens if one of the worker threads encounters an error? When our workload is simply "square a number," that's not a concern, but with more complex workloads this is very much expected.

Our first example, as mentioned above, didn't allow for true concurrency, since it kept the channel size down to 1. And all of our examples have made one other assumption: the number of results expected. In many real-world applications, one request may result in 0, 1, or many result values. So to sum up, let's create an example with the following behavior:

  • If any of the threads involved abort exceptionally, take down the whole computation, leaving no threads alive
  • Make sure that multiple workers can work in parallel
  • Let the workers exit successfully when there are no more requests available
  • Keep printing results until all worker threads exit.

We have one final tool in our arsenal that we haven't used yet: the async library, which provides some incredibly useful concurrency tools. Arguably, the most generally useful functions there are concurrently (which runs two actions in separate threads, as we'll describe in the comments below), and mapConcurrently, which applies concurrently over a list of values.

This example is how I'd recommend implementing this algorithm in practice: it uses solid library functions, accounts for exceptions, and is easy to extend for more complicated use cases.

#!/usr/bin/env stack {- stack --install-ghc --resolver lts-6.23 runghc --package random --package async --package stm-chans -} import Control.Concurrent (threadDelay) import Control.Concurrent.Async (mapConcurrently, concurrently) import Control.Concurrent.STM (atomically) import Control.Concurrent.STM.TBMChan (readTBMChan, writeTBMChan, newTBMChan, closeTBMChan) import System.Random (randomRIO) workerCount = 250 workloadCount = 10000 minDelay = 250000 -- in microseconds, == 0.25 seconds maxDelay = 750000 -- == 0.75 seconds -- Not meaningfully changed from above, just some slight style -- tweaks. Compare and contrast with the previous version if desired -- :) worker requestChan responseChan workerId = do let loop = do delay <- randomRIO (minDelay, maxDelay) threadDelay delay mint <- atomically $ readTBMChan requestChan case mint of Nothing -> return () Just int -> do atomically $ writeTBMChan responseChan (workerId, int, int * int) loop loop main = do -- Create our communication channels. Now the response channel is -- also bounded and closable. requestChan <- atomically $ newTBMChan (workerCount * 2) responseChan <- atomically $ newTBMChan (workerCount * 2) -- We're going to have three main threads. Let's define them all -- here. Note that we're _defining_ an action to be run, not -- running it yet! We'll run them below. let -- runWorkers is going to run all of the worker threads runWorkers = do -- mapConcurrently runs each function in a separate thread -- with a different argument from the list, and then waits -- for them all to finish. If any of them throw an -- exception, all of the other threads are killed, and -- then the exception is rethrown. mapConcurrently (worker requestChan responseChan) [1..workerCount] -- Workers are all done, so close the response channel atomically $ closeTBMChan responseChan -- Fill up the request channel, exactly the same as before fillRequests = do mapM_ (atomically . writeTBMChan requestChan) [1..workloadCount] atomically $ closeTBMChan requestChan -- Print each result printResults = do -- Grab a response if available mres <- atomically $ readTBMChan responseChan case mres of -- No response available, so exit Nothing -> return () -- We got a response, so... Just (workerId, int, square) -> do -- Print it... putStrLn $ concat [ "Worker #" , show workerId , ": square of " , show int , " is " , show square ] -- And loop! printResults -- Now that we've defined our actions, we can use concurrently to -- run all of them. This works just like mapConcurrently: it forks -- a thread for each action and waits for all threads to exit -- successfully. If any thread dies with an exception, the other -- threads are killed and the exception is rethrown. runWorkers `concurrently` fillRequests `concurrently` printResults return ()

By using the high level concurrently and mapConcurrently functions, we avoid any possibility of orphaned threads, and get automatic exception handling and cancelation functionality.

Why Haskell

As you can see, Haskell offers many tools for advanced concurrency. At the most basic level, Chans and forkIO give you pretty similar behavior to what other languages provide. However, IORefs with compare-and-swap provide a cheap concurrency primitive not available in most other languages. And the combination of STM and the async package is a toolset that to my knowledge has no equal in other languages. The fact that side-effects are explicit in Haskell allows us to do many advanced feats that wouldn't be possible elsewhere.

We've only just barely scratched the surface of what you can do with Haskell. If you're interested in learning more, please check out our Haskell Syllabus for a recommended learning route. There's also lots of content on the haskell-lang get started page. And if you want to learn more about concurrency, check out the async tutorial.

FP Complete also provides corporate and group webinar training sessions. Please check out our training page for more information, or see our consulting page for how we can help your team succeed with devops and functional programming.

Contact FP Complete

Advanced questions

We skirted some more advanced topics above, but for the curious, let me address some points:

  • In our first example, we use forever to ensure that our workers would never exit. But once they had no more work to do, what happens to them? The Haskell runtime is smart enough to notice in general when a channel has no more writers, and will automatically send an asynchronous exception to a thread which is trying to read from such a channel. This works well enough for a demo, but is not recommended practice:

    1. It's possible, though unlikely, that the runtime system won't be able to figure out that your thread should be killed
    2. It's much harder to follow the logic of a program which has no explicit exit case
    3. Using exceptions for control flow is generally a risk endeavor, and in the worst case, can lead to very unexpected bugs
  • For the observant Haskeller, our definitions of runWorkers and fillRequests in the last example may look dangerous. Specifically: what happens if one of those actions throws an exception before closing the channel? The other threads reading from the channel will be blocked indefinitely! Well, three things:

    1. As just described, the runtime system will likely be able to kill the thread if needed
    2. However, because of the way we structured our program, it won't matter: if either of these actions dies, it will take down the others, so no one will end up blocked on a channel read
    3. Nonetheless, I strongly recommend being exception-safe in all cases (I'm kind of obsessed with it), so a better way to implement these functions would be with finally, e.g.:

      fillRequests = mapM_ (atomically . writeTBMChan requestChan) [1..workloadCount] `finally` atomically (closeTBMChan requestChan)
  • This post was explicitly about concurrency, or running multiple I/O actions at the same time. I avoided talking about the very much related topic of parallelism, which is speeding up a computation by performing work on multiple cores. In other languages, the distinction between these is minor. In Haskell, with our separation between purity and impurity, parallelism can often be achieved with something as simple as replacing map with parMap (parallel map).

    That said, it's certainly possible - and common - to implement parallelism via concurrency. In order to make that work, we would need to force evaluation of the result value (int * int) before writing it to the channel. This could be achieved with something like:

    let !result = int * int writeChan responseChan (workerId, result)

    The ! is called a bang pattern, and indicates that evaluation should be forced immediately.

Categories: Offsite Blogs

Michael Snoyman: Spreading the Gospel of Haskell

Planet Haskell - Mon, 11/21/2016 - 6:00pm

Yesterday I fired off two tweets about the state of Haskell evangelism:

I'd hoped by now we'd be out spreading the gospel of Haskell's awesome features. Instead, we're fighting about the same broken things.

— Michael Snoyman (@snoyberg) November 21, 2016

Haskell is by far the best language on the market today. It's so sad to see it not gaining traction because of unimportant details.

— Michael Snoyman (@snoyberg) November 21, 2016 <script async="async" charset="utf-8" src=""></script>

But simply complaining about the state of things, instead of actually proposing a way to make things better, is a waste of 280 characters. So I'd like to expand on where I think we, the Haskell community, can do better.

To try and prevent the flamewars which I'm sure are about to start, let me give some guidelines on who shouldn't read this blog post:

  • If you think that programming languages should succeed on purely technical merits, and silly "marketing activities" like writing newbie-oriented tutorials and making engaging screencasts is unfair competition, you shouldn't read this blog post.
  • If you think that broken dependency solving, Hackage downtime, confusing cabal-install incantations, and problematic global package databases in the Haskell Platform have had no ill effect on Haskell adoption, you shouldn't read this blog post.
  • If you think that new users who give up after 5 minutes of confusion on a website weren't serious enough in the first place and we shouldn't be sad that we lost them, you shouldn't read this blog post.

And most likely, you shouldn't post this to /r/haskell. That's not going to end well.

Attacking Haskell's Flaws

As the Twitter discussion yesterday pointed out, there are undoubtedly flaws in Haskell. It may be inflammatory to admit that publicly, but so be it. Every language has flaws. Haskell is blessed to also have some of the greatest strengths in any programming language available today: beautiful concurrency, a powerful and useful type system, a plethora of real-world libraries, and (as of recently) pretty good tooling and educational resources.

At FP Complete, we often talk about the attractors and obstacles (thanks to our CEO, Aaron Contorer, for this great prism to view things). Using that terminology: Haskell is chock-full of attractors. The problem is the obstacles which prevent Haskell from taking off. I'm going to claim that, at this point, we need to do very little as far as making Haskell more attractive, but instead need to collectively knock down obstacles preventing its success.

Obstacles can be a great many things, some of which you may have categorized as "missing attractors." Let me give some examples:

  • Missing IDE tooling. For some people, this is a deal-breaker, and will prevent them from using Haskell.
  • A missing library. Again, if someone needs to access, say, MS SQL Server, and a library doesn't exist, this is an obstacle to adoption. (Yes, that person could go ahead and write the library him/herself. If you think that's the right response, you probably shouldn't be reading this blog post.)
  • Lack of a tutorial/example/cookbook for a specific problem domain. Yes, someone could struggle through reading API docs until "it clicks." If that's your answer: also probably shouldn't be reading this post.
  • Lack of support for an OS/architecture.

The important thing about obstacles is that they are not universal. For most of us, lack of support for Haiku OS will not prevent us from using Haskell. Those of us who have been using Haskell for years have decided that the obstacles of bad tooling weren't enough to deter us from the great language features. And so on.


Many people in the Haskell community have been chipping away at random obstacles (or adding random attractors) for years now, on a hobbyist basis. If that's all you want to do, more power to you, and enjoy. What I'm doing here is making a call for a more concerted, organized effort into knocking down these obstacles to Haskell adoption.

I'd say that we can measure how high a priority an obstacle-destroying action is based on two criteria:

  • How difficult it will be to accomplish
  • How big an impact it will have on Haskell adoption

I would call easy actions with big impact low hanging fruit, and recommend we focus on those actions for now. In other words, while improving GHC compile times may have a big impact, it's also difficult to accomplish. Similarly, changing the Haskell logo from purple to blue is easy to accomplish, but doesn't have much impact.

So my set of easy to do and big impact things entirely come down to spreading the word. I would say our biggest and easiest knocked-down obstacles are:

  • Someone's never heard of Haskell
  • Someone's heard of Haskell, but doesn't know why it's relevant
  • Someone's well aware of Haskell, but thinks it will be hard to start with
  • Someone's already tried Haskell and run into problems (like Dependency Hell), and doesn't realize we've solved them

So what does this entail? Here are my suggestions:

  • Write a blog post about how you solved a problem in Haskell
  • Give a talk at a conference on what problems Haskell is particularly good at solving (my money goes to concurrency on this)
  • Put together a screencast on Haskell
  • Encourage a non-Haskeller to go through the Haskell Book and Haskell Syllabus

The intention here to show the world that Haskell is ready to help them, and that it's easy to get started now. Many of us at FP Complete have been putting out such posts for a while. I'm asking others to join in the fun and help give Haskell adoption a kick-start.

One final request: if you've gotten this far, odds are you agree that we need to encourage users to take the most-likely-to-succeed route to Haskell, be that with tooling, training, library installation, or library selection. We've put a lot of effort into making the destination for that goal. Hopefully can converge on this goal in the future, but for now it's very likely to just present another obstacle. When you tell people to get started with Haskell, I strongly recommend linking to

Categories: Offsite Blogs

FP Complete: Mastering Time-to-Market with Haskell

Planet Haskell - Sun, 11/20/2016 - 8:00pm

For bringing your product to market, there isn't just one metric for success. That depends on your business needs. Haskell is the pure functional programming language that brings tangible benefits over its competitors for a variety of TTM priorities. Let's explore four of them.

Speed to market

You may want to bring your product to market as quickly as possible, because you have direct competitors; you have to produce a functioning demo for your investors; or your success in the market is time sensitive. Haskell speeds up this process in various ways.

Constuct things correctly the first time: Haskell is statically typed, which is a form of program verification that guarantees correctness of certain properties, like in Java or C#. Unlike Java or C#, Haskell is a pure functional language, leading to verification of far more portions of the program source code. With feedback from the compiler while developing, your developers are guaranteed a certain level of correctness and this allows them to concentrate on your domain business logic. As written elsewhere, worst practices should be hard. Also, a case study on Haskell vs C# for contract writing.

Reduce testing time: Haskell emphasizes the correct by construction approach, which is to use the type system to verify that program parts are combined in only the ways that will not crash and that make sense. Examples range from the simple to advanced. For example, Haskell web frameworks like Yesod prevent XSS and accidental broken links statically. The more surface area of your problem covered by static analysis, the less time and effort is needed by your developers for writing unit and integration tests, and the tests blow up less often in continuous integration.

Make builds reproducible: Haskell projects that are built with the Stack build tool, are guaranteed reproducible builds using a stable set of packages, with Stack also providing docker support out of the box, adding an additional layer of reproducibility. If a reproducible build is created on one developer's machine, it will work on any. This significantly reduces ramp up time for your developers, and makes continuous integration trivial for your devops people.

Use the concurrency: Many problems are solved more easily with concurrency (networking, file I/O, video/image/document processing, database access, etc.) simply because the programming model is easier to understand. Haskell has some of the best concurrency support of any popular language, it has a breadth of tools, efficiency, stability, and it is trivial to use, out of the box. Let your developers use it. See Beautiful concurrency for more about concurrency in Haskell. Additionally, the code doesn't have to be rewritten in an arcane style like in NodeJS to gain concurrency.

Shipping on schedule

You may not need to ship as soon as possible, but to ship on schedule, for a demo, a conference or as promised to your investors or customers. For this, there is another mitigating feature of Haskell.

Types shed light on scope: Using the type system of Haskell to model your domain logic helps to expand the "fog of war" that we experience with project scope: there are many unknowns, and we need to enumerate all the cases and kinds of things. Armed with this UML-without-the-UML, you can have confidence in how much scope you can cover now for the current shipping schedule, and what needs to come in version 2. This gives confidence in time estimates made by your developers. See Haskell-Providing Peace of Mind for a case-study. See Using Haskell at SQream Technologies for a technical demonstration.

Avoid build system detours: Reproducible builds help to avoid the inevitable build system detours that happen on large projects, where developers are wasting time getting the build system to work on eachother's machines and debugging failures. Stack is reproducible out of the box.

Minimizing resources

You might want or need to be cost-effective in your development cycle, using as few developers or machines as possible. Haskell also reduces development costs.

Less testing is required: Haskell software requires less testing. Your developers always write tests, but with Haskell they can write fewer, and spend less time on it. Furthermore, fewer developers are needed to achieve a stable system, because a type system helps limit scope of possibilities, and lets your developers manage complexity.

Use Haskell's raw performance: Additionally, Haskell is a fast language on a single core. It is compiled to native machine code, and is competitive with C# and Java. At times it is competitive with C, see Haskell from C: Where are the for loops? for a technical demonstration. This means that you need fewer machine resources, and fewer machines. Haskell is also easy to write concurrent code for, which means you can make use of those additional cores on your machines.

Flexbility to make changes

Most developments put a high priority in flexibility to change, or should. But you may have particularly pronounced need for flexibility in changes to requirements without disruption. This is perhaps Haskell's strongest, most desirable advantage.

Correct by construction extends to reconstruction: Making a change to a well-typed system in Haskell is far more stable and reliable than its competitors because correctness invariants are maintained with any change of the code. See also this case study by Silk.

Less maintenance of the test suite under change: This requires less maintenance of the test suite, because static analysis gives your developers immediate feedback at a granular level, whereas a test suite typically does not guide them through the change process, it only tells them what their out-of-date specification expects. Once the change process is complete, updating the test suite becomes easier.

Types are rarely buggy: It's very rare to design a data type that has bugs in it, because they are so simple. Meanwhile a unit test or integration test suite presents additional developer overhead because it itself is a program that requires maintenance too.

See this Bump case study, for why teams are choosing Haskell over Python, and here for the comparison against Ruby of a similar nature: Haskell yields fewer errors and bugs.

Expanding development effort

You may find that your project requires hiring new developers and building a team. Now is the perfect time to hire Haskell developers. Like Python developers 10 years ago, Haskell developers are self-selecting; they learn it because it's a better language, not because it will guarantee them employment. At the same time, at this stage in Haskell's development, the wealth of practical, stable packages available indicate an infusion of pragmatic, experienced programmers.

Summary and further reading

In summary we've seen that:

  • Haskell decreases development time by narrowing the scope of development to your domain.
  • Haskell decreases the need and dependency on unit testing alone.
  • Haskell aids in reproducibility which helps teams work together, expand, and deploy.
  • Haskell's speed and easy concurrency reduce the cost of development substantially.
  • Haskell helps your developers evolve a system over time safely and with confidence, yielding more reliable time estimates.

You can learn more about using Haskell as a business at FP Complete's home page, in particular the Consulting page, or go and contact us straight away and we'll be in touch.

Categories: Offsite Blogs

Brent Yorgey: MonadRandom 0.5 and mwc-random: feedback wanted

Planet Haskell - Wed, 11/16/2016 - 5:36am

Since 2013 or so I have been the maintainer of the MonadRandom package, which provides an mtl-style type class for monads with support for generation of pseudorandom values, along with a concrete random monad transformer RandT. As of this writing it has 89 reverse dependencies on Hackage—a healthy number, and one that makes me think carefully about any breaking changes to the package.

Recently I got a number of pull requests, and have been working on putting together an 0.5 release which adds a few functions, adds lazy- and strict-state variants of RandT, and reorganizes things to be closer to standard practice of the transformers package. Since this release will include some technically breaking changes already, it’s a good time to think about potentially including others.

The one thing I am not sure what to do about is this issue: Allow MonadRandom interface for MWC-random. mwc-random is a very nice package for psuedorandom number generation, but apparently it does not fit into the MonadRandom abstraction. First of all, I would like to understand why—I am not very familiar with mwc-random. Second of all, I’d love to figure out a solution, but ideally one that causes as little breakage to existing code as possible.

Leave a comment (either here or on the github issue) if this is something you know/care about, and let’s see if we can figure out a good solution together!

Categories: Offsite Blogs

Michael Snoyman: Haskell's Missing Concurrency Basics

Planet Haskell - Tue, 11/15/2016 - 6:00pm

I want to discuss two limitations in standard Haskell libraries around concurrency, and discuss methods of improving the status quo. Overall, Haskell's concurrency story is - in my opinion - the best in class versus any other language I'm aware of, at least for the single-machine use case. The following are two issues that I run into fairly regularly and are a surprising wart:

  • putStrLn is not thread-safe
  • Channels cannot be closed

Let me back up these claims, and then ask for some feedback on how to solve them.

putStrLn is not thread-safe

The example below is, in my opinion, a prime example of beautiful concurrency in Haskell:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async import Control.Concurrent.Async import Control.Monad (replicateM_) worker :: Int -> IO () worker num = replicateM_ 5 $ putStrLn $ "Hi, I'm worker #" ++ show num main :: IO () main = do mapConcurrently worker [1..5] return ()

Well, it's beautiful until you see the (abridged) output:

Hi, HIiH'HH,imii , ,,I w 'IoIIm'r'' mkmmw e owrwwro ookr#rrek2kkre ee rrr# H 3#i## 4,51

Your mileage may vary of course. The issue here is that Prelude.putStrLn works on String, which is a lazy list of Chars, and in fact sends one character at a time to stdout. This is clearly not what we want. However, at the same time, many Haskellers - myself included - consider String-based I/O a bad choice anyway. So let's replace this with Text-based I/O:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.Text as T import qualified Data.Text.IO as T worker :: Int -> IO () worker num = replicateM_ 5 $ T.putStrLn $ T.pack $ "Hi, I'm worker #" ++ show num main :: IO () main = do mapConcurrently worker [1..5] return ()

Unfortunately, if you run this (at least via runghc), the results are the same. If you look at the implementation of Data.Text.IO.hPutStr, you'll see that there are different implementations of that function depending on the buffering straregy of the Handle we're writing to. In the case of NoBuffering (which is the default with GHCi and runghc), this will output one character at a time (just like String), whereas LineBuffering and BlockBuffering have batch behavior. You can see this with:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.Text as T import qualified Data.Text.IO as T import System.IO worker :: Int -> IO () worker num = replicateM_ 5 $ T.putStrLn $ T.pack $ "Hi, I'm worker #" ++ show num main :: IO () main = do hSetBuffering stdout LineBuffering mapConcurrently worker [1..5] return ()

While better, this still isn't perfect:

Hi, I'm worker #4Hi, I'm worker #5Hi, I'm worker #1 Hi, I'm worker #4Hi, I'm worker #5Hi, I'm worker #1 Hi, I'm worker #4Hi, I'm worker #5

Unfortunately, because newlines are written to stdout separately from the message, these kinds of issues happen too frequently. This can be worked around too by using putStr instead and manually appending a newline character:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.Text as T import qualified Data.Text.IO as T import System.IO worker :: Int -> IO () worker num = replicateM_ 5 $ T.putStr $ T.pack $ "Hi, I'm worker #" ++ show num ++ "\n" main :: IO () main = do hSetBuffering stdout LineBuffering mapConcurrently worker [1..5] return ()

Finally, we can avoid the buffering-dependent code in the text package and use ByteString output, which has the advantage of automatically using this append-a-newline logic for small-ish ByteStrings:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.ByteString.Char8 as S8 worker :: Int -> IO () worker num = replicateM_ 100 $ S8.putStrLn $ S8.pack $ "Hi, I'm worker #" ++ show num main :: IO () main = do mapConcurrently worker [1..100] return ()

However, this has the downside of assuming a certain character encoding, which may be different from the encoding of the Handle.

What I'd like I would like a function Text -> IO () which - regardless of buffering strategy - appends a newline to the Text value and sends the entire chunk of data to a Handle in a thread-safe manner. Ideally it would account for character encoding (though assuming UTF8 may be an acceptable compromise for most use cases), and it would be OK if very large values are occassionally compromised during output (due to the write system call not accepting the entire chunk at once).

What I'd recommend today In a number of my smaller applications/scripts, I've become accustomed to defining a say = BS.hPutStrLn stdout . encodeUtf8. I'm tempted to add this to a library - possibly even classy-prelude - along with either reimplementing print as print = say . T.pack . show (or providing an alternative to print). I've also considered replacing the putStrLn in classy-prelude with this implementation of say.

However, I'm hoping others have some better thoughts on this, because I don't really find these solutions very appealing.

Non-closable channels

Let's implement a very simple multi-worker application with communication over a Chan:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent import Control.Concurrent.Async import Control.Monad (forever) import Data.Text (Text, pack) import Data.Text.Encoding (encodeUtf8) import qualified Data.ByteString.Char8 as S8 say :: Text -> IO () say = S8.putStrLn . encodeUtf8 worker :: Chan Int -> Int -> IO () worker chan num = forever $ do i <- readChan chan say $ pack $ concat [ "Worker #" , show num , " received value " , show i ] main :: IO () main = do chan <- newChan mapConcurrently (worker chan) [1..5] `concurrently` mapM_ (writeChan chan) [1..10] return ()

(Yes, I used the aforementioned say function.)

This looks all well and good, but check out the end of the output:

Worker #5 received value 8 Worker #3 received value 9 Worker #1 received value 10 Main: thread blocked indefinitely in an MVar operation

You see, the worker threads have no way of knowing that there are no more writeChan calls incoming, so they continue to block. The runtime system notes this, and sends them an async exception to kill them. This is a really bad idea for program structure as it can easily lead to deadlocks. Said more simply:

Instead, the workers should have some way of knowing that the channel is closed. This is a common pattern in other languages, and one I think we should borrow. Implementing this with STM isn't too bad actually, and can easily have an IO-based API if desired:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Applicative ((<|>)) import Control.Concurrent.Async import Control.Concurrent.STM import Data.Text (Text, pack) import Data.Text.Encoding (encodeUtf8) import qualified Data.ByteString.Char8 as S8 say :: Text -> IO () say = S8.putStrLn . encodeUtf8 data TCChan a = TCChan (TChan a) (TVar Bool) newTCChan :: IO (TCChan a) newTCChan = atomically $ TCChan <$> newTChan <*> newTVar False closeTCChan :: TCChan a -> IO () closeTCChan (TCChan _ var) = atomically $ writeTVar var True writeTCChan :: TCChan a -> a -> IO () writeTCChan (TCChan chan var) val = atomically $ do closed <- readTVar var if closed -- Could use nicer exception types, or return a Bool to -- indicate if writing failed then error "Wrote to a closed TCChan" else writeTChan chan val readTCChan :: TCChan a -> IO (Maybe a) readTCChan (TCChan chan var) = atomically $ (Just <$> readTChan chan) <|> (do closed <- readTVar var check closed return Nothing) worker :: TCChan Int -> Int -> IO () worker chan num = loop where loop = do mi <- readTCChan chan case mi of Nothing -> return () Just i -> do say $ pack $ concat [ "Worker #" , show num , " received value " , show i ] loop main :: IO () main = do chan <- newTCChan mapConcurrently (worker chan) [1..5] `concurrently` do mapM_ (writeTCChan chan) [1..10] closeTCChan chan return ()

Fortunately, this problem has a preexisting solution: the stm-chans package, which provides closable and bounded channels and queues. Our problem above can be more easily implemented with:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text --package stm-chans {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Concurrent.STM import Control.Concurrent.STM.TMQueue import Data.Text (Text, pack) import Data.Text.Encoding (encodeUtf8) import qualified Data.ByteString.Char8 as S8 say :: Text -> IO () say = S8.putStrLn . encodeUtf8 worker :: TMQueue Int -> Int -> IO () worker q num = loop where loop = do mi <- atomically $ readTMQueue q case mi of Nothing -> return () Just i -> do say $ pack $ concat [ "Worker #" , show num , " received value " , show i ] loop main :: IO () main = do q <- newTMQueueIO mapConcurrently (worker q) [1..5] `concurrently` do mapM_ (atomically . writeTMQueue q) [1..10] atomically $ closeTMQueue q return ()

What I'd like The biggest change needed here is just to get knowledge of this very awesome stm-chans package out there more. That could be with blog posts, or even better with links from the stm package itself. A step up from there could be to include this functionality in the stm package itself. Another possible niceity would be to add a non-STM API for these - whether based on STM or MVars internally - for more ease of use. I may take a first step here by simply depending on and reexporting stm-chans from classy-prelude.

What I'd recommend Probably pretty obvious: use stm-chans!

Like the previous point though, I'm interested to see how other people have approached this problem, since I haven't heard it discussed much in the past. Either others haven't run into this issue as frequently as I have, everyone already knows about stm-chans, or there's some other solution people prefer.

Categories: Offsite Blogs