# News aggregator

### Paul Johnson: What duties do software developers owe to users?

TL;DR: back in 2000 the poster, Bill Sourour, was employed to write a web questionnaire aimed at teenage girls that purported to advise the user about their need for a particular drug. In reality, unless you said you were allergic to it, the questionnaire always concluded that the user needed the drug. Shortly afterwards, Sourour read about a teenage girl who had possibly committed suicide due to side effects of this drug. He is still troubled by this.

Nothing the poster or his employer did was illegal. It may not even have been unethical, depending on exactly which set of professional ethics you subscribe to. But it seems clear to me that there is something wrong in a program that purports to provide impartial advice while actually trying to trick you into buying medication you don't need. Bill Sourour clearly agrees.

Out in meatspace we have a clearly defined set of rules for this kind of situation. Details vary between countries, but if you consult someone about legal, financial or medical matters then they are generally held to have a "fiduciary duty" to you. The term derives from the Latin for "faithful". If X has a fiduciary duty to Y, then X is bound at all times to act in the best interests of Y. In such a case X is said to be "the fiduciary" while Y is the "beneficiary".

In many cases fiduciary duties arise in clearly defined contexts and have clear bodies of law or other rules associated with them. If you are the director of a company then you have a fiduciary duty to the shareholders, and most jurisdictions have a specific law for that case. But courts can also find fiduciary duties in other circumstances. In English law the general principle is as follows:

*A fiduciary is someone who has undertaken to act for and on behalf of another in a particular matter in circumstances which give rise to a relationship of trust and confidence.*

It seems clear to me that this describes precisely the relationship between a software developer and a user. The user is not in a position to create the program they require, so they use one developed by someone else. The program acts as directed by the developer, but on behalf of the user. The user has to trust that the program will do what it promises, and in many cases the program will have access to confidential information which could be disclosed to others against the user's wishes.

These are not theoretical concerns. "Malware" is a very common category of software, defined as:

*any software used to disrupt computer or mobile operations, gather sensitive information, gain access to private computer systems, or display unwanted advertising.*

Sometimes malware is illicitly introduced by hacking, but in many cases the user is induced to run the malware by promises that it will do something that the user wants. In that case, software that acts against the interests of the user is an abuse of the trust placed in the developer by the user. In particular, the potential for software to "gather sensitive information" and "gain access to private computer systems" clearly shows that the user must have a "relationship of trust and confidence" with the developer, even if they have never met.

One argument against my thesis came up when I posted a question about this to the Legal forum on Stack Exchange. The answer I got from Dale M argued that:

*Engineers (including software engineers) do not have this [relationship of confidence] and AFAIK a fiduciary duty between an engineer and their client has never been found, even where the work is a one-on-one commission.*

I agree that, unlike the case of a software developer, all current examples of a fiduciary duty involve a relationship in which the fiduciary is acting directly. The fiduciary has immediate knowledge of the circumstances of the particular beneficiary, and decides from moment to moment to take actions that may or may not be in the beneficiary's best interest. In contrast, a software developer is separated in time from the user, and may have little or no knowledge of the user's situation.

I didn't argue with Dale M because Stack Exchange is for questions and answers, not debates. However I don't think that the distinction drawn by Dale M holds for software. An engineer designing a bridge is not in a position to learn the private information of those who cross the bridge, but a software engineer is often in a position to learn a great deal about the users of their product. It seems to me that this leads inescapably to the conclusion that software engineers do have a relationship of confidence with the user, and that this therefore creates a fiduciary duty.

Of course, as Dale M points out, nobody has ever persuaded a judge that software developers owe a fiduciary duty, and it's likely that in practice it's going to be a hard sell. But to go back to the example at the top, I think that Bill Sourour, or his employer, did owe a fiduciary duty to those people who ran the questionnaire software he wrote, because they disclosed private information in the expectation of getting honest advice, and the fact that they disclosed it to a program instead of a human makes no difference at all.

**Addendum: Scope of duty**

This section looks at exactly what the scope of the fiduciary duty is. It doesn't fit within the main text of this essay, so I've put it here.

Fortunately there is no need for a change in the law regarding fiduciary duty. The existence of a fiduciary duty is based on the nature of the relationship between principal and agent, although in some countries specific cases such as company directors are covered by more detailed laws.

First it is necessary to determine exactly who the fiduciary is. So far I have talked about "the software developer", but in practice software is rarely written by a single individual. We have to look at the authority that is directing the effort and deciding what functions will be implemented. If the software is produced by a company then treating the company as the fiduciary would seem to be the best approach, although it might be more appropriate to hold a senior manager liable if they have exceeded their authority.

As for the scope, I'm going to consider the scope of the fiduciary duty imposed on company directors and consider whether an analogous duty should apply to a software developer:

- Duty of care: for directors this is the duty to inform themselves and take due thought before making a decision. One might argue that a software developer should have a similar duty of care when writing software, but this is already handled through normal negligence. Elevating the application of normal professional skill to a fiduciary duty is not going to make life better for the users. However there is one area where this might be applied: lack of motive to produce secure software is widely recognised as a significant problem, and is also an area where the "confidence" aspect of fiduciary duty overlaps with a duty of care. Therefore developers who negligently fail to consider security aspects of their software should be considered to have failed in their fiduciary duty.
- Duty of loyalty: for directors this is the duty not to use their position to further their private interests. For a software developer this is straightforward: the developer should not use their privileged access to the user's computer to further their private interests. So downloading information from the user's computer (unless the user explicitly instructs this to happen) should be a breach of fiduciary duty. So would using the processing power or bandwidth owned by the user for the developer's own purposes, for instance by mining bitcoins or sending spam.
- Duty of good faith: the developer should write code that will advance the user's interests and act in accordance with the user's wishes at all times.
- Duty of confidentiality: if the developer is entrusted with user information, for example because the software interfaces with cloud storage, then this should be held as confidential and not disclosed for the developer's benefit.
- Duty of prudence: This does not map onto software development.
- Duty of disclosure: for a director this means providing all relevant information to the shareholders. For a software developer, it means completely and honestly documenting what the software does, and particularly drawing attention to any features which a user might reasonably consider against their interests. Merely putting some general clauses in the license is not sufficient; anything that could reasonably be considered to be contrary to the user's interests should be prominently indicated in a way that enables the user to prevent it.

### Douglas M. Auclair (geophf): November 2016 1HaskellADay Problems and Solutions

- November 29th, 2016: For today's #haskell problem we have a DATABASE (world's smallest H-table) and we aim to fit a line to a time-series. In which a simple mean-gain curve actually fits pretty well with the data in today's #haskell solution
- November 28th, 2016: Today's #haskell exercise we start to look at some real-world data, looking at gains over time. Today's #haskell solution computes gains over time ... also works for losses, yes?
- November 23rd, 2016: Today's #haskell exercise looks at time-series as a 'thing.' Today's #haskell solution charts the projection of #ingress scores over time

- November 22nd, 2016: So many activities in November: #NaNoWriMo2016 and now #ingressyear4 awards. Today's #haskell exercise takes them on. Today's #haskell solution shows I need to ingress 88k points per day; let's get crackin'!
- November 21st, 2016: In honor of it being #NaNoWriMo2016 here's today's #Haskell coach to 'help' (to nag) you write MORE, HARDER, FASTER! Today's #haskell solution shows SOMEbody they'd better get CRACKIN' to finish the #NaNoWriMo2016 writing challenge!
- November 16th, 2016: Today's #haskell exercise focuses on the important things in life: coffee and chocolate! Today's #haskell solution balances water well enough, but chocolate? That's a mite harder.
- November 15th, 2016: For today's #haskell exercise we look at simple unification on the Maybe-type. Today's #haskell solution is a little monadic/applicative logic to find that Amy is Amy. Surprise, surprise!
- November 14th, 2016: Today's #haskell exercise shows a little bit of Amy's family tree from the Mensa Genius Quiz-a-Day book
- November 11th, 2016: Today's #haskell exercise we tease out information from a Twitter API call to see who this @geophf-person really is! Somehow I got to solving two problems in one day, but OKAY, HERE WE GO! Information on a tweep from Twitter API JSON.
**BONUS:** November 11th, 2016: In honor of me solving today's problem yesterday, here's a bonus #haskell problem: network of shared twitter followers. Today's #haskell solution shows that twitter follower networks can get a bit ... messy. MANY FOLLOWERS! SUCH WOW! (whispered: the shared follower network is a simple natural transformation from #categorytheory:

natx (++) twerpAsRel tweep1 tweep2

Yeah.)

- November 10th, 2016: Today's #haskell exercise: @1HaskellADay has followers. Tweep id 81457442 has followers. Are there shared followers? Today's #haskell solution uses natural transformation to find shared followers of 2 tweeps via the Twitter API JSON
- November 8th, 2016: We call the Twitter API directly to get a set of follower IDs to scan for today's #haskell exercise. Today's solution shows a simple (parsing) transformation from JSON to #haskell values
- November 2nd, 2016: Today's #haskell exercise explores a twitter social network along FOLLOWS-relations. BONUS lets you explore yours.
- November 1st, 2016: For today's #haskell exercise, we look at twitter users and their relations as a JSON-graph.

### Functional Jobs: Back End Functional Developer at NYU (Full-time)

**Position Summary**

The Databrary project is looking for a smart, energetic and flexible back end developer to join its technical team. The developer will act as the primary owner of the code base of our service. Working closely with the managing director and the service team, the developer will design, develop and maintain tools to enable behavioral researchers to collaborate, store, discover, explore and access video-based research datasets. (S)he will maintain an existing code base and build new features, enhancements and integrations.

Databrary (databrary.org) is the leading open source video data-sharing system for developmental science. Datavyu (datavyu.org) is the leading free, open source, multi-platform video coding tool. This position provides a unique opportunity to play a central role in advancing open science through data sharing and reuse.

The ideal candidate is a self starter who is not afraid of learning new technologies, thinks out of the box, takes initiative, has excellent attention to detail and can work to take tasks to fruition both collaboratively in a team and independently. The developer will adapt to the evolving and growing needs of the project.

**Essential Responsibilities/Functions**
**Research and evaluation**

The developer will analyze and understand the current system and application architecture, logical and physical data models, security and storage implementation, as well as the code base; document it thoroughly; formulate high-level architectural and call-graph diagrams; and make recommendations to the managing director on a future strategic direction.

**Development and maintenance**

The developer will maintain existing code base and troubleshoot to improve application reliability and performance. (S)he will lead development, manage releases, deploy code, and track bug and QA progress. (S)he will build dynamic, modular and responsive web applications by implementing clean, reusable, well designed and well tested code to add enhancements, features and new integrations to the platform in current technologies (Haskell, PostgreSQL, AngularJS) or any other secure, modern, sustainable web frameworks.

**Innovation in data management**

The developer will work closely with experts in the field to understand the complete data lifecycle and management for researchers. (S)he will advocate for and become a force of innovation at each step of activities undertaken in relation to the collection, processing, description, transformation, retention and reuse of research data. (S)he will design, develop, implement, test and validate existing and new data management and web-based tools to facilitate research.

**Preferred Skills, Knowledge and Abilities**

Hands-on experience with functional languages like Haskell, OCaml, F#, or Scala.

Knowledge of modern web frameworks in high-level languages such as Java, Ruby, Python or PHP and video technologies.

Knowledge of JavaScript, JS frameworks, HTML, CSS and other front end technologies.

Understanding of best practices in SDLC (Software Development Life Cycle).

Understanding of TDD (Test-driven development), security and design patterns.

Experience with version control, unix scripting, automation and DevOps practices.

Familiarity using CRM, project management and task management systems.

Passion for open source projects and building high quality software.

Strong written and oral communication skills.

Superior listening and analytical skills and a knack for tackling tough problems.

Ability to multitask and juggle multiple priorities and projects.

Adaptability and openness to learn and change.

**Required Experience**

Track record of designing scalable software for web applications in modern web frameworks.

Exceptional understanding of system architecture, object oriented principles, web technologies, REST API and MVC patterns.

Solid knowledge of SQL and RDBMS like PostgreSQL.

Basic knowledge of scientific practices and research tools, such as Matlab, SPSS, or R.

**Preferred Education**

- BS, MS or Ph.D in Computer Science, Information Technology or other relevant field.

*New York University is an Equal Opportunity Employer. New York University is committed to a policy of equal treatment and opportunity in every aspect of its hiring and promotion process without regard to race, color, creed, religion, sex, pregnancy or childbirth (or related medical condition), sexual orientation, partnership status, gender and/or gender identity or expression, marital or parental status, national origin, ethnicity, alienage or citizenship status, veteran or military status, age, disability, predisposing genetic characteristics, domestic violence victim status, unemployment status, or any other legally protected basis. Women, racial and ethnic minorities, persons of minority sexual orientation or gender identity, individuals with disabilities, and veterans are encouraged to apply for vacant positions at all levels.*

Get information on how to apply for this position.

### FP Complete: Do you like Scala? Give Haskell a try!

The language Scala promises a smooth migration path from Object-oriented Java to functional programming. It runs on the JVM, has concepts both from OOP and FP, and an emphasis on interoperability with Java.

As a multi-paradigm language, Scala can flatten the learning curve for those already familiar with OOP, at the price of making some compromises in language design that can make functional programming in Scala more cumbersome than it has to be.

If you like the functional aspects of Scala, but are somewhat bothered by the annoyances that doing FP in Scala incurs, you should take a look at Haskell! It is one of the languages that inspired Scala, so you will find many familiar concepts. By fully embracing the functional paradigm, Haskell makes functional programming much more convenient. The following is an (incomplete) overview of some of the things you can expect from Haskell.

If you want to try out Haskell right away, you can find instructions for getting started here. We have collected some useful training resources at FP Complete Haskell Syllabus, and also provide commercial training. If you're planning to use Haskell for a project and want some help, we can provide you with Consulting services.

**Concise syntax**

Scala uses a Java-like syntax, but avoids a lot of unnecessary boilerplate by making things like an end-of-line semicolon or empty code blocks { } optional.

Haskell goes much further in terms of conciseness of syntax. For instance, function application, the most common operation in a functional language, is indicated by whitespace, writing f x instead of f(x).

As an example of Haskell's conciseness, consider the definition of a binary tree containing data of some type A. In Scala, you'd define a trait, and a case object/case class each for an empty node and a node that contains an A, and left and right sub-trees, like this:

```scala
sealed trait Tree[+A]
case object Empty extends Tree[Nothing]
case class Node[A](v: A, l: Tree[A], r: Tree[A]) extends Tree[A]
```

The Haskell equivalent is

```haskell
data Tree a = Empty | Node a (Tree a) (Tree a)
```

Another thing that is syntactically more pleasant in Haskell is currying: in Scala, if you want to use a curried function, you have to anticipate this when defining the function, and write

```scala
def curriedF(x: Int)(y: Int) = ...
```

instead of

```scala
def f(x: Int, y: Int) = ...
```

and then use an underscore to get a one-argument function, as in curriedF(x)_. In Haskell, you can curry any function by applying it to a subset of its arguments; if you define a function of two arguments, as in

```haskell
f x y = ...
```

then f x is a function of one argument.

Function composition, i.e., combining two functions f :: b -> c and g :: a -> b to a function from a to c, is simply indicated by a dot, as in f . g :: a -> c. With function composition and currying, it is possible to combine small functions with minimal visual overhead. For example,

```haskell
tenSmallest = take 10 . sort
```

defines a function that will first sort a list, and then return the first 10 elements.

The minimalistic syntax of Haskell both avoids excessive typing, and makes it easier to read code, since there is less distracting boilerplate.

**More powerful type system**

Both Scala and Haskell are statically typed languages, but Haskell's type system is arguably more powerful, in the sense that it provides more guarantees that are checked by the compiler. It also has superior type inference, which means that it places a lower burden on the programmer.

**Zero-cost newtypes**

One of the benefits of using a type system is that it prevents you from making mistakes such as passing an argument to a function that makes no sense. The more fine-grained the type system is, the more mistakes it can preclude.

For instance, you might have many values of type Double in your program, where some may stand for lengths as measured in meters, some for times in seconds, etc. If the type system only knows that those are Doubles, it will not get in your way when you make a silly mistake, such as adding a time and a distance.

In Scala, you can wrap a Double up in a *value class* that represents a time interval, as in

```scala
class Seconds(val value: Double) extends AnyVal
```

This type is essentially a Double, but Scala will not let you mix up values of type Seconds and Double, eliminating a class of possible errors.

Scala tries to avoid instantiation of value classes, using the underlying type as the runtime representation whenever possible. However, since the JVM does not support value classes, under some circumstances (such as when performing pattern matches or storing a value class in an array), instantiation is unavoidable and there is a run-time cost to the abstraction.

If we want to perform operations, such as addition or subtraction, on the new type Seconds, we'll have to define operations on them, such as

```scala
class Seconds(val value: Double) extends AnyVal {
  def +(x: Seconds): Seconds = new Seconds(value + x.value)
  def -(x: Seconds): Seconds = new Seconds(value - x.value)
}
```

While this is straightforward, it is boilerplate code that you'd like to avoid writing yourself.

In Haskell, you can. Defining Seconds as a newtype, as in

```haskell
newtype Seconds = Seconds Double deriving (Eq, Num, Ord)
```

will define a new type, Seconds, that has numerical operations, comparisons, and equality defined. Furthermore, the run-time representation of this will *always* be a plain Double, so there is no cost for this abstraction.

Since newtypes are so lightweight (both in terms of performance and in the amount of boilerplate code), they are used pervasively in Haskell projects, which contributes to clean, robust code.
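To make the zero-cost claim concrete, here is a minimal sketch (the Meters type and the elapsed function are illustrative additions, not from the original post; deriving Num for a newtype relies on GHC's GeneralizedNewtypeDeriving extension):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Illustrative units-of-measure newtypes: Eq, Ord and Show use
-- ordinary stock deriving, while Num is lifted from Double via
-- GeneralizedNewtypeDeriving.
newtype Seconds = Seconds Double deriving (Eq, Ord, Num, Show)
newtype Meters  = Meters  Double deriving (Eq, Ord, Num, Show)

-- Only Seconds values may be mixed here; passing Meters is a type error.
elapsed :: Seconds -> Seconds -> Seconds
elapsed start end = end - start
```

At run time both wrappers are represented exactly like a plain Double; they exist only at compile time.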

**Track side effects in the type system**

A prominent example is the IO type: in Haskell, the type signature

```haskell
f :: a -> b
```

indicates a function f that takes a value of type a, and returns a value of type b, without producing any side effects. This is distinct from

```haskell
g :: a -> IO b
```

which says that g takes input of type a, gives output of type b, but can also perform some IO in between.

Just by looking at the type signature, a programmer can conclude that f does not interact with the file system, the network, or the user. This means that the function f is referentially transparent, i.e., an occurrence of f in the code is equivalent to its result.
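As a small illustration (the function names here are made up for the example), compare a pure function with an effectful one:

```haskell
-- The type Int -> Int guarantees this function has no side effects.
double :: Int -> Int
double x = 2 * x

-- The IO in the result type is what licenses the putStrLn side
-- effect; without it, this definition would not compile.
greet :: String -> IO String
greet name = do
  putStrLn ("Hello, " ++ name)
  pure ("greeted " ++ name)
```

A caller can rely on double 21 always meaning the same value, while greet must be sequenced explicitly inside another IO action.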

In Scala, writing referentially transparent functions is encouraged as
a best practice, but it is not evident, without reading the actual
code, whether a given function has side effects or not. In Haskell,
referential transparency is *enforced* by the type system, which not
only gives important information to the programmer, but also allows
for optimizations in the form of rewrite rules.

As an example, consider the consecutive application of two functions f :: a -> b and g :: b -> c to all the elements of a list, via map. You can either write this as

```haskell
map g . map f
```

or as

```haskell
map (g . f)
```

For referentially transparent functions, both versions give the same result, but the first version involves two list traversals, and the second only one. Since referential transparency is evident in the types, GHC can safely use a rewrite rule to replace the first version with the second.

These rewrite rules are not hard-wired into the compiler, but are a feature that is exposed to the programmer, so that when you implement your own data structures, you can define your own rewrite rules to boost efficiency. A prominent example is the vector package, which uses rewrite rules to perform stream fusion, giving C-like performance to high-level code.
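The mechanism is GHC's RULES pragma. A minimal sketch (myMap is an illustrative stand-in, since base's own map already ships with fusion rules; rules only fire when compiling with optimization, and GHC does not check that they preserve semantics):

```haskell
-- An illustrative list traversal; NOINLINE keeps it visible to the
-- rule matcher instead of being inlined away first.
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr (\x acc -> f x : acc) []
{-# NOINLINE myMap #-}

-- Tell GHC that two consecutive traversals may be fused into one.
{-# RULES
"myMap/myMap" forall f g xs.
  myMap g (myMap f xs) = myMap (g . f) xs
  #-}
```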

**No conflicting implicit conversions**

Scala has the notion of *implicit conversions*, which let you add functionality to your types. For instance, declaring an implicit conversion to the Ordered trait lets you add functions for comparing values with respect to some ordering. Implicit conversions are a powerful tool, but also one that can lead to subtle errors. Since implicits can be brought into scope anywhere, you have no guarantee of consistency.

For example, when you use implicit conversions to store and retrieve data in a sorted set, there is no guarantee that the implicit conversion used for storage and retrieval is the same.

In Haskell, you would instead write an instance of the Ord typeclass for your data type. Since writing multiple instances of a given type class for the same data type is a compile-time error, consistency is guaranteed.
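A sketch of what that looks like (the Priority type is a made-up example):

```haskell
data Priority = Low | Medium | High
  deriving (Eq, Show)

-- The single, program-wide ordering for Priority. Declaring a second
-- Ord instance for the same type anywhere in the program is a
-- compile-time error, so every sorted container sees this ordering.
instance Ord Priority where
  compare a b = compare (rank a) (rank b)
    where
      rank :: Priority -> Int
      rank Low    = 0
      rank Medium = 1
      rank High   = 2
```

(Here Ord could simply be derived; the explicit instance just makes the single point of definition visible.)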

**Superior type inference**

One argument brought against static type systems is that writing down type signatures for every function can be tedious. But modern languages like Scala and Haskell feature algorithms that let the compiler *infer* the type of a function when none is stated explicitly.

Scala, being a multi-paradigm language, has to account for inheritance, which makes type inference more challenging. As a result, in many cases Scala cannot infer types that are much easier to infer in Haskell.

**More advanced features**

If you miss features like existential or universal quantification, type families, or generalized algebraic data types, Haskell has those as well.

With many researchers amongst its users, Haskell gets new extensions to its type system regularly. One notable example is Liquid Haskell, which lets you encode restrictions to your types, and pre- and postconditions for your functions, and have their correctness proven at compile time, eliminating the need for run-time checks.

**Proper tail call optimization**

Functional programming relies heavily on the use of recursion where imperative languages would use loops. In order not to blow the stack, tail call optimization is employed.

Unfortunately, due to limitations in the JVM, Scala only has fairly limited tail call optimization, in that it only optimizes self-tail calls. This excludes the case of multiple functions making tail calls to each other (which flatMap does), severely limiting the usefulness of functional programming in Scala.

There are workarounds for this, such as using trampolines, but they come with performance costs. Thus, when programming in Scala, you often face the question of whether it is OK to use functional programming techniques for some problem, or whether the price you pay in terms of performance would be too high.

Haskell does not have this problem, since all tail calls are optimized.
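For example, mutual recursion of this shape (a toy sketch, assuming non-negative input) runs in constant stack space in Haskell, while Scala's self-tail-call-only optimization would not apply:

```haskell
-- isEven tail-calls isOdd and vice versa; GHC compiles both calls
-- into jumps, so the stack does not grow with n.
isEven :: Int -> Bool
isEven 0 = True
isEven n = isOdd (n - 1)

isOdd :: Int -> Bool
isOdd 0 = False
isOdd n = isEven (n - 1)
```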

**Language interoperability**

A strong selling point of Scala is its interoperability with Java, allowing the re-use of a plethora of existing libraries.

Haskell provides a lot in terms of language interoperability as well:

- The Foreign Function Interface (FFI) allows calling C functions from Haskell code. It is very easy to use, and allows interoperation with all languages that also have C interoperability (such as C++, Python, and Ruby). A lot of popular C libraries have Haskell bindings using the FFI.
- Furthermore, it is possible to embed snippets of C code directly into your Haskell code, via inline-c.
- For data analysis and visualization, you can embed R code with inline-r.
- With the recent addition of inline-java, it is straightforward to call Java functions from Haskell (although Haskell will not share a heap with Java, as Scala does).
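As a sketch of how small an FFI binding can be, here is a direct import of cos from the C math library (the name c_cos is our own choice for the example):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble)

-- Bind the C function cos; since it has no side effects, the
-- binding can be given a pure Haskell type.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble
```

After this one declaration, c_cos behaves like any other Haskell function of type CDouble -> CDouble.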

With its origins in academia, Haskell has a reputation of being a language purely for the ivory tower. Despite this preconception, Haskell is being used successfully in real-world projects: Facebook built its spam filter with Haskell, the Commercial Haskell special interest group lists companies using Haskell, and our own success stories show some examples of successful applications of Haskell in industry.

**Get Started**

If you want to try out Haskell right away, you can find instructions for getting started here. If you want to learn more, have a look at the FP Complete Haskell Syllabus, or look at our Haskell training.

If you're planning to use Haskell for a project and want some help, we can provide you with Consulting services.

### FP Complete: Devops best practices: Multifaceted Testing

Even among skilled enterprise IT departments, it is too rare that software is thoroughly tested before deployment. Failed deployments mean costly downtime, service failures, upset users, and even security breaches. How can we verify that a solution is actually ready to deploy, free of serious defects?

Contact us for professional assistance

**You probably need more kinds of tests**

We've all seen "ready to deploy" applications that do not work as expected once deployed. Often it's because the production system is not in fact identical to the staging system, so the testing wasn't valid. This can be prevented with another devops best practice, automated deployments -- which we talked about in this recent post and will return to again.

But often the problem is that the software, even on a properly configured test
system and staging system, *was never fully tested.* Before an app is approved
for deployment, your QA system (mostly automated) should complete:

- Success tests
- Failure tests
- Corner-case tests
- Randomized tests (also called mutation tests, fuzz tests)
- Load and performance tests
- Usability tests
- Security tests
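To make the "randomized tests" item concrete, here is a dependency-free toy sketch of property-based fuzzing; in practice you would reach for a library such as QuickCheck, and lcg, randomList and prop_reverse are all invented for the example:

```haskell
-- A toy linear congruential pseudo-random generator.
lcg :: Int -> Int
lcg s = (1103515245 * s + 12345) `mod` 2147483648

-- Derive a short pseudo-random input list from a seed.
randomList :: Int -> [Int]
randomList seed = take (seed `mod` 20) (iterate lcg seed)

-- The property under test: reversing a list twice is the identity.
prop_reverse :: [Int] -> Bool
prop_reverse xs = reverse (reverse xs) == xs

-- Check the property against n pseudo-randomly generated inputs.
fuzz :: Int -> Bool
fuzz n = all (prop_reverse . randomList) (take n (iterate lcg 42))
```

The point is the shape: generate many inputs you did not think of yourself, and assert a property that must hold for all of them.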

(If you were not using automated, reproducible deployments, you would also have
to do explicit pre-tests of the deployment script itself -- but you *are* doing
fully automated deployments, right? If you aren’t, consider moving to the
sorts of tools we use in FP Deploy -- like Docker and Kubernetes, Puppet and
Ansible.)

In the rest of this article we’ll see how each kind of testing adds something different and important.

**Success testing: run with realistic inputs**

Most operations teams won't accept a deployment from the engineering group (dev or test or QA) unless the system has *at least* passed success testing. When presented with correct inputs, does the system generate correct outputs and not crash?

It’s the most basic testing. Yet it’s often left incomplete. To avoid serious omissions, use at least this checklist:

- Did you play back a realistic workload, based on a real sample of typical inputs? If not, your test's idea of what to try may be very different from what your users will actually do in the first day in production. This is a huge source of real-world failures.
- Did you test through both the UI and the API? Automated testing only through the UI may mask problems that the UI prevents or corrects. Automated testing only through the API can't find pure UI bugs.
- Were your tests updated when the spec was updated? For that matter, was the spec even updated when the software's intended function was updated? If "no" on either of these, you are not actually testing the software's correct function.
- Do you have coverage? Has someone gone back through the spec and identified all the promised functionality, and verified that some test actually tests each item -- and that it was run on the latest build, and passed?
- Did you test in a system that's configured the same as your production system? Especially if deployments are not automated, it is crucial to ensure that nothing was done in the staging system to make it easier to pass than the production system. Common examples: omitting firewalls that would be present between layers in production; running test services under accounts with excessive permissions; having a manual security checklist for production deployments that is not used on staging deployments; sequencing inputs so that multiple simulated users cannot appear concurrently. Passing tests may not mean much if the environment is rigged to ensure success.

Believe it or not, that was the easy part. For enterprise production quality, you still want to test your system six more ways. Jumping right into number two...

#### Failure testing: when you break the law, do you go to jail?

Your testing is all under an automated set of continuous integration (CI) scripts, right? (If not, time to look into that.) But do you have tests that force all of the specified error conditions to occur? Do you pass in every identified kind of prohibited/invalid input? Do you also create all the realistic external error conditions, like a network link failure, a timeout, a full disk, low memory (with a tool like Chaos Monkey)?

A test suite that doesn’t check whether specified error conditions actually generate the right errors is not complete. We recommend that the QA team present a report showing a list of all existing tests, what conditions they purport to test, and when they were last run and passed.
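A minimal sketch of that idea: for each specified error condition, assert that the specified error actually comes back. Here `parse_quantity` is a hypothetical stand-in for any input validator in your system, with an assumed spec of integers 1..100:

```python
# Failure-testing sketch: force each specified error condition and assert
# that the specified error (and not a crash) comes back. `parse_quantity`
# is a hypothetical stand-in for any input validator in your system.

def parse_quantity(text):
    """Spec (assumed): integers 1..100 inclusive; anything else is rejected."""
    try:
        value = int(text)
    except ValueError:
        raise ValueError("not an integer: %r" % text)
    if not 1 <= value <= 100:
        raise ValueError("out of range: %d" % value)
    return value

def check_rejects(bad_input):
    """True only if the validator rejects the input with the specified error."""
    try:
        parse_quantity(bad_input)
    except ValueError:
        return True
    return False
```

A complete suite would enumerate every prohibited input class from the spec and run a `check_rejects`-style assertion for each.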

#### Corner-case testing: try something crazy

Maybe you’ve heard the joke: a QA engineer walks into a bar, and orders a beer, and 2 beers, and 20 beers, and 0 beers, and a million beers, and -1 beers. And a duck.

Corner-case testing means success testing using *unrealistic but legal* inputs.
Often, developers write code that works correctly in typical cases, but fails in
the extremes.

Before deploying, consider: are any of your users going to try anything crazy?
What would be the oddest things still permitted, and what happens if you try?
Who tested that and verified that the output was correct? Correct code works on
*all* permitted inputs, not just *average* ones. This is a fast way to find bugs
before deployment -- push the system right to the edge of what it should be able
to do.

Corner cases vary by application, but here are some typical examples to spark your thinking:

- Where strings are permitted, what happens if they are in a very differently structured language, like Chinese or Arabic? What happens if they are extremely long?
- Where numbers are permitted, what happens if they are very large, very small, zero, negative? And why is the permitted range as big as it is; should it be reduced? Is it legal to request output of a billion records, and what happens if I do?
- Where options are permitted, what if someone chooses all of them, or a bizarre mixture? Can I order a pizza with 30 toppings? Can I prescribe 50 medicines, at 50 bottles each, for a sample patient?
- What happens if nested or structured inputs are extremely complex? Can I send an email with 100 embeddings, each of which is an email with 100 embeddings?

If an application hasn’t been tested with ridiculous-but-legal inputs, no one really knows if it’s going to hold up in production.
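For instance, a corner-case pass over a hypothetical `format_name_badge` function, whose assumed spec allows any non-empty Unicode string of at most 10,000 characters, might look like:

```python
# Corner-case sketch: legal-but-extreme inputs for a hypothetical
# `format_name_badge`, whose (assumed) spec allows any non-empty Unicode
# string of at most 10,000 characters.

def format_name_badge(name):
    if not 0 < len(name) <= 10_000:
        raise ValueError("name length out of spec")
    return "Hello, %s!" % name

CORNER_CASES = [
    "A",              # minimum legal length
    "x" * 10_000,     # maximum legal length
    "张伟",           # a differently structured script (Chinese)
    "مرحبا",          # right-to-left text (Arabic)
]

def run_corner_cases():
    """Every corner case is legal, so every call must succeed."""
    return [format_name_badge(name) for name in CORNER_CASES]
```

Each case sits right at the edge of what the spec permits, which is exactly where typical-case code tends to break.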

#### Randomized testing: never saw *that* before!

No human team can test every possible combination of cases and actions. And that
may be okay, because many projects find more bugs per unit of effort through
*randomized* generation of test cases than any other way.

This means writing scripts that start with well-understood inputs, and then letting them make random, arbitrary changes to these inputs and run again, then change and run again, many thousands of times. Even if it’s not realistic to test the outputs for correctness (because the script may be unable to tell what its crazy inputs were supposed to do), the outputs can be tested for structural validity -- and the system can be watched for not crashing, and not generating any admin alerts or unhandled errors or side effects.
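A minimal sketch of the idea: start from a well-understood input, mutate it at random many times, and check only that the system either succeeds or fails with a handled, specified error. The JSON round-tripper here is a hypothetical stand-in for the real system:

```python
# Randomized-testing sketch: repeatedly apply random single-character
# mutations to a known-good input and check for structural health only:
# either a valid result or the specified, handled error -- never a crash.
import json
import random

def system_under_test(payload):
    """Hypothetical stand-in: parses JSON and echoes a normalized version."""
    doc = json.loads(payload)
    return json.dumps(doc, sort_keys=True)

def mutate(payload, rng):
    """Randomly insert, delete, or replace one character."""
    i = rng.randrange(len(payload))
    op = rng.choice(("insert", "delete", "replace"))
    ch = chr(rng.randrange(32, 127))
    if op == "insert":
        return payload[:i] + ch + payload[i:]
    if op == "delete" and len(payload) > 1:
        return payload[:i] + payload[i + 1:]
    return payload[:i] + ch + payload[i + 1:]

def fuzz(seed_payload, iterations=1000, seed=0):
    """Return the number of unhandled (unspecified) failures observed."""
    rng = random.Random(seed)
    unhandled = 0
    payload = seed_payload
    for _ in range(iterations):
        payload = mutate(payload, rng)
        try:
            system_under_test(payload)
        except json.JSONDecodeError:
            pass            # specified, handled error: fine
        except Exception:
            unhandled += 1  # anything else is a bug to investigate
    return unhandled
```

Real fuzzing runs for many thousands of iterations and also watches logs, alerts, and side effects, but the structure is the same.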

It’s downright surprising how fast you can find bugs in a typical unsafe language (like Python or C or Java) through simple mutation testing. Extremely safe languages like Haskell tend to find these bugs at compile time, but it may still be worth trying some randomized testing. Remember, machine time is cheap; holes in deployed code are very expensive.

#### Load and performance testing: now that's heavy

Companies with good devops say heavy load is one of the top remaining sources of failure. The app works for a while, but fails when peak user workload hits. Be on the lookout for conditions that could overload your servers, and make sure someone is forcing them to happen on the staging system -- before they happen in production.

Consider whether your test and staging systems are similar enough to your production system. If your production system accepts 5000 requests per second on 10 big machines, and your test system accepts 5 per second on one tiny VM, how will you know about database capacity issues or network problems?

A good practice is to throw enormous, concurrent, simulated load at your test system that (1) exceeds any observed real-world load and (2) includes a wide mix of realistic inputs, perhaps a stream of historic real captured inputs as well as random ones. This reduces the chance that you threw a softball at the system when real users are going to throw a hardball.

Performance testing can include sending faster and faster inputs until some hardware resource becomes saturated. (You may enjoy watching system monitor screens as this is happening!) Find the bottleneck -- what resource can you expect to fail first in production? How will you prevent it? Do your deployment scripts specify an abundance of this resource? Have you implemented cloud auto-scaling so that new parallel servers are fired up when the typically scarce resource (CPU, RAM, network link, …) gets too busy?
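A toy sketch of the ramp-up idea: push batches of increasing size through the system and record throughput, looking for where scaling stops. Here `work_item` is a hypothetical substitute for one request's worth of real processing:

```python
# Performance-testing sketch: measure items/second at increasing batch
# sizes to see where throughput stops scaling. `work_item` is a
# hypothetical stand-in for real request handling.
import time

def work_item(n=2_000):
    """Stand-in for one request's worth of CPU work."""
    return sum(i * i for i in range(n))

def measure_throughput(batch_sizes):
    """Return (batch size, items/second) for each batch."""
    results = []
    for batch in batch_sizes:
        start = time.perf_counter()
        for _ in range(batch):
            work_item()
        elapsed = time.perf_counter() - start
        results.append((batch, batch / elapsed))
    return results
```

In a real system you would run the workers concurrently against a production-like deployment and watch CPU, RAM, disk, and network while the rate climbs.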

#### Usability testing: it doesn't work if people can't use it

Most people consider this to be outside the realm of devops. Who cares if users find your system confusing and hard to use? Well, lots of people, but why should a devops person care?

What will happen to your production environment if a new feature is deployed and suddenly user confusion goes through the roof? Will they think there’s a bug? Will support calls double in an hour, and stay doubled? Will you be forced to do a rollback?

User interface design probably isn’t your job. But if you are deploying user-facing software that has not been usability tested, you’re going to hear about it. Encourage your colleagues to do real testing of their UI on realistic, uninitiated users (not just team members who know what the feature is supposed to do), or at least skeptical test staff who know how to try naive things on purpose, before declaring a new feature ready to deploy.

#### Security testing: before it's too late

One of the worst things you can do is to cause a major security breach, leading to a loss of trust and exposure of users’ private data.

Testing a major, public-facing, multi-server device (a distributed app) for security is a big topic, and I’d be doing a disservice by trying to summarize it in just a couple of paragraphs. We’ll return in future posts to both the verification and testing side, and the design and implementation side, of security. A best practice is to push quality requirements upstream, letting developers know that security is their concern too, and ensuring that integration-test systems use a secure automated deployment similar or identical to the production system. Don’t let developers say “I assume you’ll secure this later.”

Meanwhile, as a devops best practice, your deployment and operations team should have at least one identified security expert, a person whose job includes knowing about all the latest security test tools and ensuring that they are being used where appropriate. Are you checking for XSS attacks and SQL injection attacks? DDoS attacks? Port-scan attacks? Misconfigured default accounts? It’s easy to neglect security until it’s too late, so make someone responsible.

Security holes can appear in application code, or in the platform itself (operating system, library packages, and middleware). At a minimum, security testing should include running standard off-the-shelf automated scanning software that looks for known ways to intrude, most often taking advantage of poor default configurations, or of platform components that have not been upgraded to the latest patch level. Run an automated security scan before moving a substantially changed system from staging into production. Automated testing is almost free, in stark contrast to the costly manual clean-up after a breach.
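As one small, automatable example of the kind of check such a scan performs, here is a sketch that probes a host for listening TCP ports that are not on an allow-list. Real scanners (for XSS, SQL injection, default accounts, and more) do far more; the point is only that the check is scripted and repeatable:

```python
# Security-testing sketch: probe a host for listening TCP ports that are
# not explicitly allowed. Illustrates a scripted pre-deployment check,
# not a substitute for a real scanner.
import socket

def open_ports(host, ports, timeout=0.2):
    """Return the subset of `ports` accepting TCP connections on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found

def unexpected_open_ports(host, allowed, scan_range):
    """Ports that are open but not allowed -- fail the deployment if any."""
    return [p for p in open_ports(host, scan_range) if p not in allowed]
```

Run against staging before each promotion to production, a non-empty result becomes a blocking failure rather than a surprise after a breach.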

#### Conclusions

Wow, that’s a lot of testing! More than a lot of companies actually do. Yet it’s quite hard to look back through that list and find something that’s okay to omit. Clearly the devops pipeline doesn’t begin at the moment of deployment, but sooner, in the development team itself. That means developers and testers taking responsibility for delivering a quality product that’s actually ready to deploy.

As usual, systems thinking wins. A good operations team doesn’t say “give us what you’ve got, and we’ll somehow get it online.” A good operations team says “we offer deployment and operation services, and here’s how you can give us software that will deploy successfully into our production environment.”

We hope you’ll keep reading our blog and making use of what we have learned. Thanks for spending time with FP Complete.

### Ken T Takusagawa: [xxneozpu] Demonstration of Data.Numbers.Fixed

Here is a little demonstration of arbitrary precision floating-point (actually fixed-point) arithmetic in Haskell, using Data.Numbers.Fixed in the numbers package. Using dynamicEps, we calculate sqrt(1/x) to arbitrary precision, outputting the number in a user-specified base.

Performance is not so great; the package implements Floating functions using elegant but not necessarily high performance algorithms based on continued fractions (Gosper, HAKMEM). The hmpfr package might be better. (For square root, it might also be easy to roll one's own implementation of Newton's method on Rational.)

Radix conversion of a fractional number is implemented as an unfold, similar to radix conversion of an integer.
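Both ingredients are easy to sketch with exact rational arithmetic. Here is a Python analogue (using `fractions.Fraction` in place of Haskell's `Rational`, with names of my own choosing; this is Newton's method as suggested in the post, not the continued-fraction algorithm the numbers package uses):

```python
# Sketch of the post's two ingredients with exact rationals: Newton's
# method for square root on fractions.Fraction (Python's analogue of
# Rational), then radix conversion of the fractional part as an unfold.
from fractions import Fraction

def rational_sqrt(a, eps=Fraction(1, 10**30)):
    """Approximate sqrt(a) to within eps by the iteration x -> (x + a/x)/2."""
    x = Fraction(1)
    while abs(x * x - a) > eps:
        x = (x + a / x) / 2
    return x

def fractional_digits(value, base, count):
    """Unfold the fractional part of `value` into `count` digits in `base`."""
    frac = value - int(value)
    digits = []
    for _ in range(count):
        frac *= base
        d = int(frac)
        digits.append(d)
        frac -= d
    return digits

# sqrt(1/4) = 1/2, so its base-10 fractional digits begin 5, 0, 0, 0.
```

The digit loop is exactly the unfold the post describes: each step emits one digit and passes the remaining fractional part to the next step.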

### Michael Snoyman: Haskell Documentation, 2016 Update

I've blogged, Tweeted, and conversed about Haskell documentation quite a bit in the past. Following up on tooling issues, all available evidence tells me that improving the situation for documentation in Haskell is the next obstacle we need to knock down.

This blog post will cover:

- Where I think the biggest value is to be had in improving Haskell documentation
- Status of various initiatives I've been involved in (and boy do I walk away red-faced from this)
- Recommendations for how others can best contribute
- A basic idea of what actions I - and others at FP Complete - have been taking

In my opinion, the sore spot for Haskell overall is *intermediate*
docs. (Yes, that's vague, bear with me momentarily.) I'm going to
posit that:

- Beginners are currently well served by introductory Haskell books, and most recently by Haskell Programming from First Principles
- Once you have a solid basis in intermediate concepts, it's much easier to jump into libraries and academic papers and understand what's going on

To me, intermediate means you already know the basics of Haskell syntax, monads, and common typeclasses, but aren't really familiar with any non-base libraries, concurrency, or exception handling. The goal of intermediate documentation is to:

- Teach *which* libraries to use, *when* to use them, and *how* to use them
- Give a guide on structuring Haskell programs
- Educate on important techniques, including those mentioned above, as well as issues around lazy evaluation and other common stumbling blocks

Many of us who have learned Haskell over the past many years have probably picked up these topics sporadically. While some people will want to plow ahead in that kind of haphazard approach, my belief is that the vast majority of users want to be more guided through the process. We'll get to the Haskell Syllabus and opinionated vs unopinionated content a bit later.

#### Previous efforts

It turns out, as I'm quite embarrassed to admit, that I've essentially tried reinventing the same intermediate docs concept multiple times, first with MezzoHaskell, and then with the Commercial Haskell doc initiative. You may also include School of Haskell in that list, but I'm going to treat it separately.

These initiatives never took off. A pessimistic view is that Haskellers are simply uninterested in contributing to such a shared body of intermediate-level docs. I actually believed that for a bit, but recent activity has convinced me otherwise. I think these previous initiatives failed due to an unsatisfactory user experience. These initiatives required people to go to an infrequently used Github repo to view docs, which no one was doing. A few months back, a new option presented itself.

#### haskell-lang's documentation

For those who haven't seen it, you should check out the libraries page and documentation page on the haskell-lang.org site. I believe this hits the nail on the head in many different ways:

- It directly addresses a common user question: which are the best libraries to use for a certain task? The libraries page gives this answer (though it can certainly be improved and expanded!)
- We're able to curate the documentation as a community. Providing a list of recommended documents on this site gives a reader more confidence than simply Googling and hoping the author knows what he/she is talking about
- The collaboration is done via pull requests on markdown files. I've discussed previously why I think this is a far better collaboration technique than Wikis or other options.
- Instead of requiring all docs to live within the haskell-lang.org repository, documents can be embedded from elsewhere. For example, I've written a conduit tutorial in the conduit repository, and embedded its content on haskell-lang.org via a simple inclusion mechanism. This allows authors to maintain their documentation individually, but provide users with a central location to find these kinds of documents. (I'd encourage other sites to take advantage of this transclusion technique; getting quality content into user hands is the goal!)

haskell-lang tries to host only "uncontroversial" documentation. Documents explaining how to use a library are pretty straightforward. Recommending libraries like bytestring, text, and vector are all pretty well accepted. And for cases where multiple libraries are used, we link to both.

I've merged all of the content I wrote in MezzoHaskell and the
Commercial Haskell doc initiative into haskell-lang.org where it
fit. However, there was still some more controversial content left,
such as
exceptions best practices,
which I know many people disagree with me about. Also, I'd like to be
able to tell a user looking for a solution, "yes, there are multiple
libraries around, I recommend **X**." Neither of these belong on a
community site like haskell-lang, so for those...

This is where alternative sites thrive. Since I'm collaborating with others at FP Complete on this, and actively using this in training courses, I've put together a Haskell Syllabus page. This is where I'll tell someone "you should do X, even though others disagree with me." I won't enumerate the contentious decisions here (odds are someone else will ultimately make such a list on my behalf).

And if you disagree with this? Write a new syllabus! I think it would be a healthy thing if we could get to the point as a community where we could have multiple recommended, opinionated syllabuses, and link to them with a short description for each one. This may sound at odds with some of my previous statements, so let me clarify:

- When there's an obviously best choice, tell the user to use it
- When most users will be best with one choice, and another option is available, mention it as a footnote
- When multiple options are available and there's no way to know which the user will want, break down and give them all the information they need. But...
- Try to make that happen as infrequently - and as late in the learning process - as possible! If we could have a "you've completed Beginner Haskell, please choose between one of the following," and explain the difference between "FP Complete's course" vs (for example) "lens-first Haskell", that would be a good trade-off.

My thoughts on this are still evolving, and will likely change in the future as I get more feedback from users.

#### Writing style

Another big change I've made over the years is writing style. I wrote the Yesod book in a very prose-heavy manner, focusing on explaining details with words, and using concise, to-the-point code examples. Many users have given me feedback to push me in a different direction. Instead, I've recently been trying to write in a very different, and thankfully easier to write, style:

- Short explanation of what the thing is I'm talking about and when you'd use it
- Synopsis: medium-sized code snippet to give a flavor (I used this in the Yesod book too, and stole the idea straight from Perl docs)
- A series of increasingly complex examples, with the bare minimum amount of content around it to explain what's going on

I'd put this style into a hybrid of tutorial and cookbook, and think it works well overall. I've only heard positives so far versus previous styles, so that's encouraging. Some examples:

- The http-client tutorial fits this well, since there are no deep concepts to explain
- conduit has a bit more explanation, but is still mostly code examples
- At the other end, Primitive Haskell and Covariance and Contravariance are much more concept-heavy, and therefore still have a significant chunk of prose

I'm taking this approach because I think it's what most users want. Some important points:

- Not all users are the same! There will almost certainly be users who would prefer a different style of docs. Given enough user feedback and manpower to write docs, it would be great to cater to all tastes, but it's best right now to focus on the highest demand
- API docs are still necessary, and are completely orthogonal to tutorials. A tutorial doesn't document each API call, an API-call-level explanation doesn't give enough breadth, and certainly users need more than just the type signatures.

After all of that, my recommendation on how to get involved is pretty simple:

- Pick a library that doesn't have a good tutorial
- Write a tutorial
- Submit a PR to haskell-lang to include the content
- Alternatively: get the Markdown file included in the project's repo, and include the remote file

haskell-lang has a nifty feature. If you visit https://haskell-lang.org/library/vector, it will display the vector documentation it has. But if you visit a package like https://haskell-lang.org/library/stm which doesn't (yet) have a tutorial on haskell-lang, it will automatically redirect you to the Stackage package page. When giving out links to people on the internet, I recommend using the haskell-lang.org/library/XXX link.

- When a tutorial is available, the haskell-lang page is great
- When a tutorial isn't available, the doc building on Stackage is still the most reliable around
- In addition, Stackage docs properly link to docs built in the same Stackage snapshot, making the cross linking more reliable
- When present, Stackage gives preference to the README.md files in a package, which are generally more useful than the description fields.

I'd be remiss in not mentioning School of Haskell here. As far as I'm concerned, School of Haskell is great platform for an individual to write content without any collaboration. However, for all of the cases I'm describing here, some kind of easy collaboration (via pull requests) is a huge win. Opening things up more with pull requests, README.md files, and embedding content into multiple external sites seems like the best option today (until someone comes up with something better!).

### Manuel M T Chakravarty: Here is the video of my talk “A Type is Worth a Thousand...

Here is the video of my talk “A Type is Worth a Thousand Tests” presented at Sydney CocoaHeads, November 2016 (you can also get the slides) — I previously presented this talk at YOW! Connected, Melbourne, and an earlier version at Curry On in Rome.

In this talk, I argue that types are a design tool that, if applied correctly, reduces the need for tests. I illustrate this at the example of the design of a simple iPhone app in Swift whose source code is available on GitHub.

### Joe Armstrong Interviews Alan Kay

Youtube video (via HN)

By far not the best presentation of Kay's ideas but surely a must watch for fans. Otherwise, skip until the last third of the interview which might add to what most people here already know.

It is interesting that in this talk Kay rather explicitly talks about programming languages as abstraction layers. He also mentions some specifics that may not be as well known as others, yet played a role in his trajectory, such as META.

I fully sympathize with his irritation with the lack of attention to and appreciation of fundamental concepts and theoretical frameworks in CS. On the other hand, I find his allusions to biology unconvincing.

And oh, he is certainly right about Minsky's book (my first introduction to theoretical CS) and in his deep appreciation of John McCarthy.

### Derek Elkins: Category Theory, Syntactically

This will be a *very* non-traditional introduction to the ideas behind category theory. It will essentially be a slice through model theory (presented in a more programmer-friendly manner) with an unusual organization. Of course the end result will be *****SPOILER ALERT***** it was category theory all along. A secret decoder ring will be provided at the end. This approach is inspired by the notion of an internal logic/language and by Vaughan Pratt’s paper The Yoneda Lemma Without Category Theory.

I want to be very clear, though. This is not meant to be an analogy or an example or a guide for intuition. This *is* category theory. It is simply presented in a different manner.

The first concept we’ll need is that of a theory. If you’ve ever implemented an interpreter for even the simplest language, then most of what follows, modulo some terminological differences, should be both familiar and very basic. If you are familiar with algebraic semantics, then that is exactly what is happening here, only restricted to unary (but multi-sorted) algebraic theories.

For us, a **theory**, #ccT#, is a collection of **sorts**, a collection of (unary) **function symbols**1, and a collection of **equations**. Each function symbol has an input sort and an output sort which we’ll call the **source** and **target** of the function symbol. We’ll write #ttf : A -> B# to say that #ttf# is a function symbol with source #A# and target #B#. We define #"src"(ttf) -= A# and #"tgt"(ttf) -= B#. Sorts and function symbols are just symbols. Something is a sort if it is in the collection of sorts. Nothing else is required. A function symbol is not a function, it’s just a, possibly structured, name. Later, we’ll map those names to functions, but the same name may be mapped to different functions. In programming terms, a theory defines an interface or signature. We’ll write #bb "sort"(ccT)# for the collection of sorts of #ccT# and #bb "fun"(ccT)# for the collection of function symbols.

A (raw) **term** in a theory is either a **variable** labelled by a sort, #bbx_A#, or it’s a function symbol applied to a term, #tt "f"(t)#, such that the sort of the term #t# matches the source of #ttf#. The sort or target of a term is the sort of the variable if it’s a variable or the target of the outermost function symbol. The source of a term is the sort of the innermost variable. In fact, all terms are just sequences of function symbol applications to a variable, so there will always be exactly one variable. All this is to say the expressions need to be “well-typed” in the obvious way. Given a theory with two function symbols #ttf : A -> B# and #ttg : B -> A#, #bbx_A#, #bbx_B# , #tt "f"(bbx_A)#, and #tt "f"(tt "g"(tt "f"(bbx_A)))# are all examples of terms. #tt "f"(bbx_B)# and #tt "f"(tt "f"(bbx_A))# are not terms because they are not “well-typed”, and #ttf# by itself is not a term simply because it doesn’t match the syntax. Using Haskell syntax, we can define a data type representing this syntax if we ignore the sorting:
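Ignoring the sorting, a minimal sketch of such a data type (with `Sort` and `FunctionSymbol` standing for whatever representation of symbol names you choose) is:

```haskell
-- Unsorted raw terms: a chain of function-symbol applications ending in a
-- variable labelled by its sort. Well-sortedness is not enforced here.
data Term = Var Sort | Apply FunctionSymbol Term
```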

Using GADTs, we could capture the sorting constraints as well:

```haskell
data Term (s :: Sort) (t :: Sort) where
  Var :: Term t t
  Apply :: FunctionSymbol x t -> Term s x -> Term s t
```

An important operation on terms is **substitution**. Given a term #t_1# with source #A# and a term #t_2# with target #A# we define the substitution of #t_2# into #t_1#, written #t_1[bbx_A |-> t_2]#, as:

If #t_1 = bbx_A# then #bbx_A[bbx_A |-> t_2] -= t_2#.

If #t_1 = tt "f"(t)# then #tt "f"(t)[bbx_A |-> t_2] -= tt "f"(t[bbx_A |-> t_2])#.

Using the theory from before, we have:

#tt "f"(bbx_A)[bbx_A |-> tt "g"(bbx_B)] = tt "f"(tt "g"(bbx_B))#

As a shorthand, for arbitrary terms #t_1# and #t_2#, #t_1(t_2)# will mean #t_1[bbx_("src"(t_1)) |-> t_2]#.
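Since every raw term is just a chain of function symbols wrapped around a single variable, substitution is essentially concatenation of chains plus a sort check. A small Python sketch (the encoding and names are mine, not the post's):

```python
# Sketch: a unary term is (sort of its variable, tuple of function symbols
# listed outermost-first); substitution concatenates the chains.
# Example theory: f : A -> B, g : B -> A.
SYMBOLS = {"f": ("A", "B"), "g": ("B", "A")}

def var(sort):
    """The term consisting of just a variable of the given sort."""
    return (sort, ())

def target(term):
    """Target sort of a term: that of its outermost symbol, or its variable."""
    src, syms = term
    return SYMBOLS[syms[0]][1] if syms else src

def apply(symbol, term):
    """Apply a function symbol to a term, checking sorts."""
    assert SYMBOLS[symbol][0] == target(term), "ill-sorted application"
    src, syms = term
    return (src, (symbol,) + syms)

def substitute(t1, t2):
    """t1[x |-> t2]: requires tgt(t2) == src(t1)."""
    src1, syms1 = t1
    assert target(t2) == src1, "sort mismatch"
    src2, syms2 = t2
    return (src2, syms1 + syms2)
```

With this encoding the worked example #tt "f"(bbx_A)[bbx_A |-> tt "g"(bbx_B)] = tt "f"(tt "g"(bbx_B))# is literally the concatenation of the chains `("f",)` and `("g",)`.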

Finally, equations2. An **equation** is a pair of terms with equal source and target, for example, #(: tt "f"(tt "g"(bbx_B)), bbx_B :)#. The idea is that we want to identify these two terms. To do this we quotient the set of terms by the congruence generated by these pairs, i.e. by the reflexive-, symmetric-, transitive-closure of the relation generated by the equations which further satisfies “if #s_1 ~~ t_1# and #s_2 ~~ t_2# then #s_1(s_2) ~~ t_1(t_2)#”. From now on, by “terms” I’ll mean this quotient with “raw terms” referring to the unquotiented version. This means that when we say “#tt "f"(tt "g"(bbx_B)) = bbx_B#”, we really mean the two terms are congruent with respect to the congruence generated by the equations. We’ll write #ccT(A, B)# for the collection of terms, in this sense, with source #A# and target #B#. To make things look a little bit more normal, I’ll write #s ~~ t# as a synonym for #(: s, t :)# when the intent is that the pair represents a given equation.

Expanding the theory from before, we get the **theory of isomorphisms**, #ccT_{:~=:}#, consisting of two sorts, #A# and #B#, two function symbols, #ttf# and #ttg#, and two equations #tt "f"(tt "g"(bbx_B)) ~~ bbx_B# and #tt "g"(tt "f"(bbx_A)) ~~ bbx_A#. The equations lead to equalities like #tt "f"(tt "g"(tt "f"(bbx_A))) = tt "f"(bbx_A)#. In fact, it doesn’t take much work to show that this theory only has four distinct terms: #bbx_A#, #bbx_B#, #tt "f"(bbx_A)#, and #tt "g"(bbx_B)#.
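That count of four can be checked mechanically: in this theory the quotient amounts to cancelling adjacent `f`/`g` pairs, so normalizing every well-typed term up to some length should leave exactly four normal forms. A quick Python check (the encoding, symbol chains listed outermost-first, is my own):

```python
# Sketch: terms of the theory of isomorphisms, normalized by cancelling
# adjacent f.g and g.f chains (the two equations). A term is encoded as
# (the variable's sort, tuple of function symbols outermost-first).
NEXT_SYMBOL = {"A": "f", "B": "g"}   # the only symbol out of each sort
TARGET = {"f": "B", "g": "A"}

def terms_from(sort, max_len):
    """All well-typed symbol chains over a variable of `sort`, up to max_len."""
    chains, chain, cur = [()], (), sort
    for _ in range(max_len):
        s = NEXT_SYMBOL[cur]         # well-typedness forces each next symbol
        chain, cur = (s,) + chain, TARGET[s]
        chains.append(chain)
    return chains

def normalize(chain):
    """Cancel adjacent f/g pairs until none remain (stack-based rewriting)."""
    stack = []
    for s in chain:
        if stack and {stack[-1], s} == {"f", "g"}:
            stack.pop()
        else:
            stack.append(s)
    return tuple(stack)

normal_forms = {(sort, normalize(c)) for sort in "AB" for c in terms_from(sort, 10)}
```

The surviving normal forms are exactly #bbx_A#, #bbx_B#, #tt "f"(bbx_A)#, and #tt "g"(bbx_B)#.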

In traditional model theory or universal algebra we tend to focus on multi-ary operations, i.e. function symbols that can take multiple inputs. By restricting ourselves to only unary function symbols, we expose a duality. For every theory #ccT#, we have the **opposite theory**, #ccT^(op)# defined by using the same sorts and function symbols but swapping the source and target of the function symbols which also requires rewriting the terms in the equations. The rewriting on terms is the obvious thing, e.g. if #ttf : A -> B#, #ttg : B -> C#, and #tth : C -> D#, then the term in #ccT#, #tt "h"(tt "g"(tt "f"(bbx_A)))# would become the term #tt "f"(tt "g"(tt "h"(bbx_D)))# in #ccT^(op)#. From this it should be clear that #(ccT^(op))^(op) = ccT#.

Given two theories #ccT_1# and #ccT_2# we can form a new theory #ccT_1 xx ccT_2# called the **product theory** of #ccT_1# and #ccT_2#. The sorts of this theory are pairs of sorts from #ccT_1# and #ccT_2#. The collection of function symbols is the *disjoint union* #bb "fun"(ccT_1) xx bb "sort"(ccT_2) + bb "sort"(ccT_1) xx bb "fun"(ccT_2)#. A disjoint union is like Haskell’s Either type. Here we’ll write #tt "inl"# and #tt "inr"# for the left and right injections respectively. #tt "inl"# takes a function symbol from #ccT_1# and a sort from #ccT_2# and produces a function symbol of #ccT_1 xx ccT_2# and similarly for #tt "inr"#. If #tt "f" : A -> B# in #ccT_1# and #C# is a sort of #ccT_2#, then #tt "inl"(f, C) : (A, C) -> (B, C)# and similarly for #tt "inr"#.

The collection of equations for #ccT_1 xx ccT_2# consists of the following:

- for every equation, #l ~~ r# of #ccT_1# and every sort, #C#, of #ccT_2# we produce an equation #l’ ~~ r’# by replacing each function symbol #ttf# in #l# and #r# with #tt "inl"(tt "f", C)#
- similarly for equations of #ccT_2#
- for every pair of function symbols #ttf : A -> B# from #ccT_1# and #ttg : C -> D# from #ccT_2#, we produce the equation #tt "inl"(tt "f", D)(tt "inr"(A, tt "g")(bbx_{:(A, C")":})) ~~ tt "inr"(B, tt "g")("inl"(tt "f", C)(bbx_{:(A, C")":}))#

The above is probably unreadable. If you work through it, you can show that every term of #ccT_1 xx ccT_2# is equivalent to a pair of terms #(t_1, t_2)# where #t_1# is a term in #ccT_1# and #t_2# is a term in #ccT_2#. Using this equivalence, the first bullet is seen to be saying that if #l = r# in #ccT_1# and #C# is a sort in #ccT_2# then #(l, bbx_C) = (r, bbx_C)# in #ccT_1 xx ccT_2#. The second is similar. The third then states

#(t_1, bbx_C)((bbx_A, t_2)(bbx_{:"(A, C)":})) = (t_1, t_2)(bbx_{:"(A, C)":}) = (bbx_A, t_2)((t_1, bbx_C)(bbx_{:"(A, C)":}))#.

To establish the equivalence between terms of #ccT_1 xx ccT_2# and pairs of terms from #ccT_1# and #ccT_2#, we use the third bullet to move all the #tt "inl"#s outward at which point we’ll have a sequence of #ccT_1# function symbols followed by a sequence of #ccT_2# function symbols each corresponding to term.

The above might seem a bit round about. An alternative approach would be to define the function symbols of #ccT_1 xx ccT_2# to be all pairs of all the *terms* from #ccT_1# and #ccT_2#. The problem with this approach is that it leads to an *explosion* in the number of function symbols and equations required. In particular, it easily produces an infinitude of function symbols and equations even when provided with theories that only have a finite number of sorts, function symbols, and equations.

As a concrete and useful example, consider the theory #ccT_bbbN# consisting of a single sort, #0#, a single function symbol, #tts#, and no equations. This theory has a term for each natural number, #n#, corresponding to #n# applications of #tts#. Now let’s articulate #ccT_bbbN xx ccT_bbbN#. It has one sort, #(0, 0)#, two function symbols, #tt "inl"(tt "s", 0)# and #tt "inr"(0, tt "s")#, and it has one equation, #tt "inl"(tt "s", 0)(tt "inr"(0, tt "s")(bbx_{:(0, 0")":})) ~~ tt "inr"(0, tt "s")("inl"(tt "s", 0)(bbx_{:(0, 0")":}))#. Unsurprisingly, the terms of this theory correspond to pairs of natural numbers. If we had used the alternative definition, we’d have had an infinite number of function symbols and an infinite number of equations.
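That correspondence is easy to check mechanically: the single commuting equation means a term is determined by how many times each injected successor occurs, i.e. by a pair of naturals. A quick Python sketch (encoding mine):

```python
# Sketch: in T_N x T_N, the one equation lets the two function symbols
# commute past each other, so a term is just a pair of occurrence counts.
from itertools import permutations

def normalize(symbols):
    """Map a chain of the two function symbols to a pair of naturals."""
    return (symbols.count("inl(s,0)"), symbols.count("inr(0,s)"))

# Every interleaving of two inl-steps and one inr-step denotes the same term:
interleavings = set(permutations(["inl(s,0)", "inl(s,0)", "inr(0,s)"]))
```

All three distinct interleavings normalize to the single pair (2, 1), matching the claim that terms correspond to pairs of natural numbers.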

Nevertheless, for clarity I will typically write a term of a product theory as a pair of terms.

As a relatively easy exercise — easier than the above — you can formulate and define the disjoint sum of two theories #ccT_1 + ccT_2#. The idea is that every term of #ccT_1 + ccT_2# corresponds to either a term of #ccT_1# or a term of #ccT_2#. Don’t forget to define what happens to the equations.

Related to these, we have the theory #ccT_{:bb1:}#, which consists of one sort and no function symbols or equations, and #ccT_{:bb0:}# which consists of no sorts and thus no possibility for function symbols or equations. #ccT_{:bb1:}# has exactly one term while #ccT_{:bb0:}# has no terms.

#### Collages

Sometimes we’d like to talk about function symbols whose source is in one theory and target is in another. As a simple example, that we’ll explore in more depth later, we may want function symbols whose sources are in a product theory. This would let us consider terms with multiple inputs.

The natural way to achieve this is to simply make a new theory that contains sorts from both theories plus the new function symbols. A **collage**, #ccK#, from a theory #ccT_1# to #ccT_2#, written #ccK : ccT_1 ↛ ccT_2#, is a theory whose collection of sorts is the disjoint union of the sorts of #ccT_1# and #ccT_2#. The function symbols of #ccK# consist of, for each function symbol #ttf : A -> B# in #ccT_1#, a function symbol #tt "inl"(ttf) : tt "inl"(A) -> tt "inl"(B)#, and similarly for function symbols from #ccT_2#. Equations from #ccT_1# and #ccT_2# are likewise taken and lifted appropriately, i.e. #ttf# is replaced with #tt "inl"(ttf)# or #tt "inr"(ttf)# as appropriate. Additional function symbols of the form #k : tt "inl"(A) -> tt "inr"(Z)# where #A# is a sort of #ccT_1# and #Z# is a sort of #ccT_2#, and potentially additional equations involving these function symbols, may be given. (If no additional function symbols are given, then this is exactly the disjoint sum of #ccT_1# and #ccT_2#.) These additional function symbols and equations are what differentiate two collages that have the same source and target theories. Note, there are *no* function symbols #tt "inr"(Z) -> tt "inl"(A)#, i.e. where #Z# is in #ccT_2# and #A# is in #ccT_1#. That is, there are no function symbols going the “other way”. To avoid clutter, I’ll typically assume that the sorts and function symbols of #ccT_1# and #ccT_2# are disjoint already, and dispense with the #tt "inl"#s and #tt "inr"#s.

Summarizing, we have #ccK(tt "inl"(A), tt "inl"(B)) ~= ccT_1(A, B)#, #ccK(tt "inr"(Y), tt "inr"(Z)) ~= ccT_2(Y, Z)#, and #ccK(tt "inr"(Z), tt "inl"(A)) = O/# for all #A#, #B#, #Y#, and #Z#. #ccK(tt "inl"(A), tt "inr"(Z))# is arbitrary for any #A# and #Z#. To distinguish them, I’ll call the function symbols that go from one theory to another **bridges**. More generally, an arbitrary term that has its source in one theory and target in another will be described as a **bridging term**.

Here’s a somewhat silly example. Consider #ccK_+ : ccT_bbbN xx ccT_bbbN ↛ ccT_bbbN# that has one bridge #tt "add" : (0, 0) -> 0# with the equations #tt "add"(tt "inl"(tts, 0)(bbx_("("0, 0")"))) ~~ tts(tt "add"(bbx_("("0, 0")")))# and #tt "add"(tt "inr"(0, tts)(bbx_("("0, 0")"))) ~~ tts(tt "add"(bbx_("("0, 0")")))#.

More usefully, if a bit degenerately, every theory induces a collage in the following way. Given a theory #ccT#, we can build the collage #ccK_ccT : ccT ↛ ccT# where the bridges consist of the following. For each sort, #A#, of #ccT#, we have the following bridge: #tt "id"_A : tt "inl"(A) -> tt "inr"(A)#. Then, for every function symbol, #ttf : A -> B# in #ccT#, we have the following equation: #tt "inr"(tt "f")(tt "id"_A(bbx_(tt "inl"(A)))) ~~ tt "id"_B(tt "inl"(tt "f")(bbx_(tt "inl"(A))))#. We have #ccK_ccT(tt "inl"(A), tt "inr"(B)) ~= ccT(A, B)#.

You can think of a bridging term in a collage as a sequence of function symbols partitioned into two parts by a bridge. Naturally, we might consider partitioning into more than two parts by having more than one bridge. It’s easy to generalize the definition of collage to combine an arbitrary number of theories, but I’ll take a different, probably less good, route. Given collages #ccK_1 : ccT_1 ↛ ccT_2# and #ccK_2 : ccT_2 ↛ ccT_3#, we can make the collage #ccK_2 @ ccK_1 : ccT_1 ↛ ccT_3# by defining its bridges to be triples of a bridge of #ccK_1#, #k_1 : A_1 -> A_2#, a term, #t : A_2 -> B_2# of #ccT_2#, and a bridge of #ccK_2#, #k_2 : B_2 -> B_3# which altogether will be a bridge of #ccK_2 @ ccK_1# going from #A_1 -> B_3#. These triples essentially represent a term like #k_2(t(k_1(bbx_(A_1))))#. With this intuition we can formulate the equations. For each equation #t'(k_1(t_1)) ~~ s'(k'_1(s_1))# where #k_1# and #k'_1# are bridges of #ccK_1#, we have for every bridge #k_2# of #ccK_2# and term #t# of the appropriate sorts #(k_2, t(t'(bbx)), k_1)(t_1) ~~ (k_2, t(s'(bbx)), k'_1)(s_1)# and similarly for equations involving the bridges of #ccK_2#.

This composition is associative… almost. Furthermore, the collages generated by theories, #ccK_ccT#, behave like identities for this composition… almost. It turns out these statements are true, but only up to isomorphism of theories. That is, #(ccK_3 @ ccK_2) @ ccK_1 ~= ccK_3 @ (ccK_2 @ ccK_1)#, but the two sides are not equal.

To talk about isomorphism of theories we need the notion of…

### Interpretations

An interpretation of a theory gives meaning to the syntax of a theory. There are two nearly identical notions of interpretation for us: interpretation (into sets) and interpretation into a theory. I’ll define them in parallel. An **interpretation** (**into a theory**), #ccI#, is a mapping, written #⟦-⟧^ccI#, though the superscript will often be omitted, which maps sorts to sets (sorts) and function symbols to functions (terms). The mapping satisfies:

#⟦"src"(f)⟧ = "src"(⟦f⟧)# and #⟦"tgt"(f)⟧ = "tgt"(⟦f⟧)# where #"src"# and #"tgt"# on the right are the domain and codomain operations for an interpretation.

We extend the mapping to a mapping on terms via:

- #⟦bbx_A⟧ = x |-> x#, i.e. the identity function, or, for interpretation into a theory, #⟦bbx_A⟧ = bbx_{:⟦A⟧:}#
- #⟦tt "f"(t)⟧ = ⟦tt "f"⟧ @ ⟦t⟧# or, for interpretation into a theory, #⟦tt "f"(t)⟧ = ⟦tt "f"⟧(⟦t⟧)#

and we require that for any equation of the theory, #l ~~ r#, #⟦l⟧ = ⟦r⟧#. (Technically, this is implicitly required for the extension of the mapping to terms to be well-defined, but it’s clearer to state it explicitly.) I’ll write #ccI : ccT -> bb "Set"# when #ccI# is an interpretation of #ccT# into sets, and #ccI’ : ccT_1 -> ccT_2# when #ccI’# is an interpretation of #ccT_1# into #ccT_2#.

An interpretation of the theory of isomorphisms produces a bijection between two specified sets. Spelling out a simple example where #bbbB# is the set of booleans:

- #⟦A⟧ -= bbbB#
- #⟦B⟧ -= bbbB#
- #⟦tt "f"⟧ -= x |-> not x#
- #⟦tt "g"⟧ -= x |-> not x#

plus the proof #not not x = x#.
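Transcribed into Haskell as a quick sanity check (a sketch of this particular interpretation; the names `interpF` and `interpG` are mine):

```haskell
-- Both sorts A and B are interpreted as the set of booleans,
-- and both function symbols f and g as negation.
interpF :: Bool -> Bool
interpF = not

interpG :: Bool -> Bool
interpG = not

-- The equations g(f(x)) ~~ x and f(g(x)) ~~ x hold because
-- not (not x) == x for both booleans.
```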

As another simple example, we can interpret the theory of isomorphisms into itself slightly non-trivially.

- #⟦A⟧ -= B#
- #⟦B⟧ -= A#
- #⟦tt "f"⟧ -= tt "g"(bbx_B)#
- #⟦tt "g"⟧ -= tt "f"(bbx_A)#

As an (easy) exercise, you should define #pi_1 : ccT_1 xx ccT_2 -> ccT_1# and similarly #pi_2#. If you defined #ccT_1 + ccT_2# before, you should define #iota_1 : ccT_1 -> ccT_1 + ccT_2# and similarly for #iota_2#. As another easy exercise, show that an interpretation of #ccT_{:~=:}# is a bijection. In Haskell, an interpretation of #ccT_bbbN# would effectively be foldNat. Something very interesting happens when you consider what an interpretation of the collage generated by a theory, #ccK_ccT#, is. Spell it out. In a different vein, you can show that a collage #ccK : ccT_1 ↛ ccT_2# and an interpretation #ccT_1^(op) xx ccT_2 -> bb "Set"# are essentially the same thing in the sense that each gives rise to the other.
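To make the foldNat remark concrete, here is a sketch. An interpretation of #ccT_bbbN# is a choice of carrier (the image of the one sort) and a function on it (the image of #tts#), and foldNat maps each term to its meaning under that choice (the bare-bones Nat and the names here are mine):

```haskell
data Nat = Z | S Nat

-- An interpretation of T_N: a carrier r and a function r -> r.
-- foldNat sends each term of T_N to its meaning in that interpretation.
foldNat :: r -> (r -> r) -> Nat -> r
foldNat z _ Z     = z
foldNat z s (S n) = s (foldNat z s n)

-- For instance, interpreting the sort as Int and s as (+1)
-- recovers the usual numeric value of a numeral.
toInt :: Nat -> Int
toInt = foldNat 0 (+1)
```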

Two theories are **isomorphic** if there exist interpretations #ccI_1 : ccT_1 -> ccT_2# and #ccI_2 : ccT_2 -> ccT_1# such that #⟦⟦A⟧^(ccI_1)⟧^(ccI_2) = A# and vice versa, and similarly for function symbols. In other words, each is interpretable in the other, and if you go from one interpretation and then back, you end up where you started. Yet another way to say this is that there is a one-to-one correspondence between sorts and terms of each theory, and this correspondence respects substitution.

As a crucially important example, the set of terms, #ccT(A, B)#, can be extended to an interpretation. In particular, for each sort #A#, #ccT(A, -) : ccT -> bb "Set"#. Its action on function symbols is the following:

#⟦tt "f"⟧^(ccT(A, -)) -= t |-> tt "f"(t)#

We have, dually, #ccT(-, A) : ccT^(op) -> bb "Set"# with the following action:

#⟦tt "f"⟧^(ccT(-, A)) -= t |-> t(tt "f"(bbx_B))#

We can abstract from both parameters making #ccT(-, =) : ccT^(op) xx ccT -> bb "Set"# which, by an earlier exercise, can be shown to correspond with the collage #ccK_ccT#.

Via an abuse of notation, I’ll identify #ccT^(op)(A, -)# with #ccT(-, A)#, though technically we only have an isomorphism between the interpretations, and to talk about isomorphisms between interpretations we need the notion of…

### Homomorphisms

The theories we’ve presented are (multi-sorted) universal algebra theories. Universal algebra allows us to specify a general notion of “homomorphism” that generalizes monoid homomorphism or group homomorphism or ring homomorphism or lattice homomorphism.

In universal algebra, the algebraic theory of groups consists of a single sort, a nullary operation, #1#, a binary operation, #*#, a unary operation, #tt "inv"#, and some equations which are unimportant for us. Operations correspond to our function symbols except that they’re not restricted to being unary. A particular group is a particular interpretation of the algebraic theory of groups, i.e. it is a set and three functions into the set. A group homomorphism then is a function between those two groups, i.e. between the two interpretations, that preserves the operations. In a traditional presentation this would look like the following:

Say #alpha : G -> K# is a group homomorphism from the group #G# to the group #K# and #g, h in G# then:

- #alpha(1_G) = 1_K#
- #alpha(g *_G h) = alpha(g) *_K alpha(h)#
- #alpha(tt "inv"_G(g)) = tt "inv"_K(alpha(g))#

Using something more akin to our notation, it would look like:

- #alpha(⟦1⟧^G) = ⟦1⟧^K#
- #alpha(⟦*⟧^G(g,h)) = ⟦*⟧^K(alpha(g), alpha(h))#
- #alpha(⟦tt "inv"⟧^G(g)) = ⟦tt "inv"⟧^K(alpha(g))#
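A familiar concrete instance of these equations, sketched in Haskell: length is a monoid homomorphism from lists under concatenation to integers under addition (viewing monoids as one-sorted interpretations; the names below are mine):

```haskell
-- length preserves the unit: the empty list maps to 0.
lengthPreservesUnit :: Bool
lengthPreservesUnit = length ([] :: [Int]) == 0

-- length preserves the binary operation: ++ maps to +.
lengthPreservesOp :: [Int] -> [Int] -> Bool
lengthPreservesOp xs ys = length (xs ++ ys) == length xs + length ys
```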

The #tt "inv"# case is the most relevant for us as it is unary. However, for us, a function symbol #ttf# may have a different source and target, and so we may need a different function on each side of the equation. E.g. for #ttf : A -> B#, #alpha : ccI_1 -> ccI_2#, and #a in ⟦A⟧^(ccI_1)# we’d have:

#alpha_B(⟦tt "f"⟧^(ccI_1)(a)) = ⟦tt "f"⟧^(ccI_2)(alpha_A(a))#

So a **homomorphism** #alpha : ccI_1 -> ccI_2 : ccT -> bb "Set"# is a *family* of functions, one for each sort of #ccT#, that satisfies the above equation for every function symbol of #ccT#. We call the individual functions making up #alpha# **components** of #alpha#, and we have #alpha_A : ⟦A⟧^(ccI_1) -> ⟦A⟧^(ccI_2)#. The definition for an interpretation into a theory, #ccT_2#, is identical except the components of #alpha# are terms of #ccT_2# and #a# can be replaced with #bbx_(⟦A⟧^(ccI_1))#. Two interpretations are **isomorphic** if we have homomorphism #alpha : ccI_1 -> ccI_2# such that each component is a bijection. This is the same as requiring a homomorphism #beta : ccI_2 -> ccI_1# such that for each #A#, #alpha_A(beta_A(x)) = x# and #beta_A(alpha_A(x)) = x#. A similar statement can be made for interpretations into theories, just replace #x# with #bbx_(⟦A⟧)#.

Another way to look at homomorphisms is via collages. A homomorphism #alpha : ccI_1 -> ccI_2 : ccT -> bb "Set"# gives rise to an interpretation of the collage #ccK_ccT#. The interpretation #ccI_alpha : ccK_ccT -> bb "Set"# is defined by:

- #⟦tt "inl"(A)⟧^(ccI_alpha) -= ⟦A⟧^(ccI_1)#
- #⟦tt "inr"(A)⟧^(ccI_alpha) -= ⟦A⟧^(ccI_2)#
- #⟦tt "inl"(ttf)⟧^(ccI_alpha) -= ⟦ttf⟧^(ccI_1)#
- #⟦tt "inr"(ttf)⟧^(ccI_alpha) -= ⟦ttf⟧^(ccI_2)#
- #⟦tt "id"_A⟧^(ccI_alpha) -= alpha_A#

The homomorphism law guarantees that it satisfies the equation on #tt "id"#. Conversely, given an interpretation of #ccK_ccT#, we have the homomorphism #⟦tt "id"⟧ : ⟦tt "inl"(-)⟧ -> ⟦tt "inr"(-)⟧ : ccT -> bb "Set"#, and the equation on #tt "id"# is exactly the homomorphism law.

### Yoneda

Consider a homomorphism #alpha : ccT(A, -) -> ccI#. #alpha# needs to satisfy, for every sort #C# and #D#, every function symbol #ttf : C -> D#, and every term #t : A -> C#:

#alpha_D(tt "f"(t)) = ⟦tt "f"⟧^ccI(alpha_C(t))#

Looking at this equation, the possibility of viewing it as a recursive “definition” leaps out, suggesting that the action of #alpha# is completely determined by its action on the variables. Something like this, for example:

#alpha_D(tt "f"(tt "g"(tt "h"(bbx_A)))) = ⟦tt "f"⟧(alpha_C(tt "g"(tt "h"(bbx_A)))) = ⟦tt "f"⟧(⟦tt "g"⟧(alpha_B(tt "h"(bbx_A)))) = ⟦tt "f"⟧(⟦tt "g"⟧(⟦tt "h"⟧(alpha_A(bbx_A))))#

We can easily establish that there’s a one-to-one correspondence between the set of homomorphisms #ccT(A, -) -> ccI# and the elements of the set #⟦A⟧^ccI#. Given a homomorphism, #alpha#, we get an element of #⟦A⟧^ccI# via #alpha_A(bbx_A)#. Inversely, given an element #a in ⟦A⟧^ccI#, we can define a homomorphism #a^**# via:

- #a_D^**(tt "f"(t)) -= ⟦tt "f"⟧^ccI(a_C^**(t))#
- #a_A^**(bbx_A) -= a#

which clearly satisfies the condition on homomorphisms by definition. It’s easy to verify that #(alpha_A(bbx_A))^** = alpha# and immediately true that #a^**(bbx_A) = a# establishing the bijection.

We can state something stronger. Given any homomorphism #alpha : ccT(A, -) -> ccI# and any function symbol #ttg : A -> X#, we can make a new homomorphism #alpha * ttg : ccT(X, -) -> ccI# via the following definition:

#(alpha * ttg)(t) = alpha(t(tt "g"(bbx_A)))#

Verifying that this is a homomorphism is straightforward:

#(alpha * ttg)(tt "f"(t)) = alpha(tt "f"(t(tt "g"(bbx_A)))) = ⟦tt "f"⟧(alpha(t(tt "g"(bbx_A)))) = ⟦tt "f"⟧((alpha * ttg)(t))#

and like any homomorphism of this form, as we’ve just established, it is completely determined by its action on variables, namely #(alpha * ttg)_X(bbx_X) = alpha_X(tt "g"(bbx_A)) = ⟦tt "g"⟧(alpha_A(bbx_A))#. In particular, if #alpha = a^**#, then we have #a^** * ttg = (⟦tt "g"⟧(a))^**#. Together these facts establish that we have an interpretation #ccY : ccT -> bb "Set"# such that #⟦A⟧^ccY -= (ccT(A, -) -> ccI)#, the *set* of homomorphisms, and #⟦tt "g"⟧^ccY(alpha) -= alpha * tt "g"#. The work we did before established that we have homomorphisms #(-)(bbx) : ccY -> ccI# and #(-)^** : ccI -> ccY# that are inverses. This is true for all theories and all interpretations as at no point did we use any particular facts about them. This statement is the (dual form of the) Yoneda lemma. To get the usual form simply replace #ccT# with #ccT^(op)#. A particularly important and useful case (so useful it’s usually used tacitly) occurs when we choose #ccI = ccT(B,-)#, we get #(ccT(A, -) -> ccT(B, -)) ~= ccT(B, A)# or, choosing #ccT^(op)# everywhere, #(ccT(-, A) -> ccT(-, B)) ~= ccT(A, B)# which states that a term from #A# to #B# is equivalent to a homomorphism from #ccT(-, A)# to #ccT(-, B)#.
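For Haskell readers, this bijection is the Haskell rendering of the Yoneda lemma, with a Functor f playing the role of #ccI#. A minimal self-contained sketch (the kan-extensions package provides an equivalent Yoneda type; the names here are mine):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A homomorphism out of T(a,-) corresponds to a polymorphic function
-- forall b. (a -> b) -> f b.
newtype Yoneda f a = Yoneda { runYoneda :: forall b. (a -> b) -> f b }

-- alpha |-> alpha_A(x_A): evaluate the homomorphism at the variable.
lower :: Yoneda f a -> f a
lower y = runYoneda y id

-- a |-> a^*: the homomorphism completely determined by the element a.
lift :: Functor f => f a -> Yoneda f a
lift a = Yoneda (\g -> fmap g a)
```

The round trips `lower . lift` and `lift . lower` are identities, mirroring the bijection established above.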

There is another result, dual in a different way, called the co-Yoneda lemma. It turns out it is a corollary of the fact that for a collage #ccK : ccT_1 ↛ ccT_2#, #ccK_(ccT_2) @ ccK ~= ccK# and the dual is just the composition the other way. To get (closer to) the precise result, we need to be able to turn an interpretation into a collage. Given an interpretation, #ccI : ccT -> bb "Set"#, we can define a collage #ccK_ccI : ccT_bb1 ↛ ccT# whose bridges from #1 -> A# are the elements of #⟦A⟧^ccI#. Given this, the co-Yoneda lemma is the special case, #ccK_ccT @ ccK_ccI ~= ccK_ccI#.

Note that the Yoneda and co-Yoneda lemmas only apply to interpretations into sets, as #ccY# involves the *set* of homomorphisms.

The Yoneda lemma suggests that the interpretations #ccT(A, -)# and #ccT(-, A)# are particularly important and this will be borne out as we continue.

We call an interpretation, #ccI : ccT^(op) -> bb "Set"# **representable** if #ccI ~= ccT(-, X)# for some sort #X#. We then say that #X# **represents** #ccI#. What this states is that every term of sort #X# corresponds to an element in one of the sets that make up #ccI#, and these transform appropriately. There’s clearly a particularly important element, namely the image of #bbx_X# which corresponds to an element in #⟦X⟧^ccI#. This element is called the **universal element**. The dual concept is, for #ccI : ccT -> bb "Set"#, #ccI# is **co-representable** if #ccI ~= ccT(X, -)#. We will also say #X# represents #ccI# in this case as it actually does when we view #ccI# as an interpretation of #(ccT^(op))^(op)#.

As a rather liberating exercise, you should establish the following result called **parameterized representability**. Assume we have theories #ccT_1# and #ccT_2#, and a family of sorts of #ccT_2#, #X#, and a family of interpretations of #ccT_2^(op)#, #ccI#, both indexed by sorts of #ccT_1#, such that for each #A in bb "sort"(ccT_1)#, #X_A# represents #ccI_A#, i.e. #ccI_A ~= ccT_2(-, X_A)#. Given all this, then there is a *unique* interpretation #ccX : ccT_1 -> ccT_2# and #ccI : ccT_1 xx ccT_2^(op) -> bb "Set"# where #⟦A⟧^(ccX) -= X_A# and #"⟦("A, B")⟧"^ccI -= ⟦B⟧^(ccI_A)# such that #ccI ~= ccT_2(=,⟦-⟧^ccX)#. To be a bit more clear, the right hand side means #(A, B) |-> ccT_2(B, ⟦A⟧^ccX)#. Simply by choosing #ccT_1# to be a product of multiple theories, we can generalize this result to an arbitrary number of parameters. What makes this result liberating is that we just don’t need to worry about the parameters, they will automatically transform homomorphically. As a technical warning though, since two interpretations may have the same action on sorts but a different action on function symbols, if the family #X_A# was derived from an interpretation #ccJ#, i.e. #X_A -= ⟦A⟧^ccJ#, it may *not* be the case that #ccX = ccJ#.

Let’s look at some examples.

As a not-so-special case of representability, we can consider #ccI -= ccK(tt "inl"(-), tt "inr"(Z))# where #ccK : ccT_1 ↛ ccT_2#. Saying that #A# represents #ccI# in this case is saying that *bridging* terms of sort #tt "inr"(Z)#, i.e. sort #Z# in #ccT_2#, in #ccK#, correspond to terms of sort #A# in #ccT_1#. We’ll call the universal element of this representation the **universal bridge** (though technically it may be a bridging term, not a bridge). Let’s write #varepsilon# for this universal bridge. What representability states in this case is that given *any* bridging term #k# of sort #Z#, there exists a unique term #|~ k ~|# of sort #A# such that #k = varepsilon(|~ k ~|)#. If we have an interpretation #ccX : ccT_2 -> ccT_1# such that #⟦Z⟧^ccX# represents #ccK(tt "inl"(-), tt "inr"(Z))# for each sort #Z# of #ccT_2# we say we have a **right representation** of #ccK#. Note that the universal bridges become a family #varepsilon_Z : ⟦Z⟧^ccX -> Z#. Similarly, if #ccK(tt "inl"(A), tt "inr"(-))# is co-representable for each #A#, we say we have a **left representation** of #ccK#. The **co-universal bridge** is then a bridging term #eta_A : A -> ⟦A⟧# such that for any bridging term #k# with source #A#, there exists a unique term #|__ k __|# in #ccT_2# such that #k = |__ k __|(eta_A)#. For reference, we’ll call these equations **universal properties** of the left/right representation. Parameterized representability implies that a left/right representation is essentially unique.

Define #ccI_bb1# via #⟦A⟧^(ccI_bb1) -= bb1# where #bb1# is some one element set. #⟦ttf⟧^(ccI_bb1)# is the identity function for all function symbols #ttf#. We’ll say a theory #ccT# **has a unit sort** or **has a terminal sort** if there is a sort that we’ll also call #bb1# that represents #ccI_bb1#. Spelling out what that means, we first note that there is nothing notable about the universal element as it’s the only element. However, writing the homomorphism #! : ccI_bb1 -> ccT(-, bb1)# and noting that since there’s only one element of #⟦A⟧^(ccI_bb1)# we can, with a slight abuse of notation, also write the term #!# picks out as #!# which gives the equation:

#!_B(tt "g"(t)) = !_A(t)# for any function symbol #ttg : A -> B# and term, #t#, of sort #A#, note #!_A : A -> bb1#.

This equation states what the isomorphism also fairly directly states: there is exactly one term of sort #bb1# from any sort #A#, namely #!_A(bbx_A)#. The dual notion is called a **void sort** or an **initial sort** and will usually be notated #bb0#, the analog of #!# will be written as #0#. The resulting equation is:

#tt "f"(0_A) = 0_B# for any function symbol #ttf : A -> B#, note #0_A : bb0 -> A#.
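In Haskell terms (a rough analogy, reading types as sorts): () plays the role of the unit sort #bb1# and Void the void sort #bb0#, with the unique maps:

```haskell
import Data.Void (Void, absurd)

-- !_A : the unique function from any type into the unit type.
bang :: a -> ()
bang _ = ()

-- 0_A : the unique function out of the empty type into any type.
zero :: Void -> a
zero = absurd
```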

For the next example, I’ll leverage collages. Consider the collage #ccK_2 : ccT ↛ ccT xx ccT# whose bridges from #A -> (B, C)# consist of pairs of terms #t_1 : A -> B# and #t_2 : A -> C#. #ccT# **has pairs** if #ccK_2# has a right representation. We’ll write #(B, C) |-> B xx C# for the representing interpretation’s action on sorts. We’ll write the universal bridge as #(tt "fst"(bbx_(B xx C)), tt "snd"(bbx_(B xx C)))#. The universal property then looks like #(tt "fst"(bbx_(B xx C)), tt "snd"(bbx_(B xx C)))((: t_1, t_2 :)) = (t_1, t_2)# where #(: t_1, t_2 :) : A -> B xx C# is the unique term induced by the bridge #(t_1, t_2)#. The universal property implies the following equations:

- #(: tt "fst"(bbx_(B xx C)), tt "snd"(bbx_(B xx C)) :) = bbx_(B xx C)#
- #tt "fst"((: t_1, t_2 :)) = t_1#
- #tt "snd"((: t_1, t_2 :)) = t_2#
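These are exactly the laws of pairing in Haskell, where the unique term #(: t_1, t_2 :)# is the “fork” of two functions (a sketch; `fork` is my name for what Control.Arrow calls `(&&&)`):

```haskell
-- (: t1, t2 :) : the unique map into the product induced by t1 and t2.
fork :: (a -> b) -> (a -> c) -> a -> (b, c)
fork f g x = (f x, g x)

-- The three equations become:
--   fork fst snd p       == p
--   fst (fork f g x)     == f x
--   snd (fork f g x)     == g x
```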

One aspect of note is that, regardless of whether #ccK_2# has a right representation, i.e. regardless of whether #ccT# has pairs, it always has a left representation. The co-universal bridge is #(bbx_A, bbx_A)# and the unique term #|__(t_1, t_2)__|# is #tt "inl"(t_1, bbx_A)(tt "inr"(bbx_A, t_2)(bbx_("("A,A")")))#.

Define an interpretation #Delta : ccT -> ccT xx ccT# so that #⟦A⟧^Delta -= (A,A)# and similarly for function symbols. #Delta# left represents #ccK_2#. If the interpretation #(B,C) |-> B xx C# right represents #ccK_2#, then we say we have an adjunction between #Delta# and #(- xx =)#, written #Delta ⊣ (- xx =)#, and that #Delta# is left adjoint to #(- xx =)#, and conversely #(- xx =)# is right adjoint to #Delta#.

More generally, whenever we have the situation #ccT_1(⟦-⟧^(ccI_1), =) ~= ccT_2(-, ⟦=⟧^(ccI_2))# we say that #ccI_1 : ccT_2 -> ccT_1# is **left adjoint** to #ccI_2 : ccT_1 -> ccT_2# or conversely that #ccI_2# is **right adjoint** to #ccI_1#. We call this arrangement an **adjunction** and write #ccI_1 ⊣ ccI_2#. Note that we will always have this situation if #ccI_1# left represents and #ccI_2# right represents the same collage. As we noted above, parameterized representability actually determines one adjoint given (its action on sorts and) the other adjoint. With this we can show that adjoints are unique up to isomorphism, that is, given two left adjoints to an interpretation, they must be isomorphic. Similarly for right adjoints. This means that stating something is a left or right adjoint to some other known interpretation essentially completely characterizes it. One issue with adjunctions is that they tend to be wholesale. Let’s say the pair sort #A xx B# existed but no other pair sorts existed, then the (no longer parameterized) representability approach would work just fine, but the adjunction would no longer exist.

Here are a few exercises using this. First, a moderately challenging one (until you catch the pattern): spell out the details of the left adjoint to #Delta#. We say a theory **has sums** and write those sums as #A + B# if #(- + =) ⊣ Delta#. Recast void and unit sorts using adjunctions and/or left/right representations. As a terminological note, we say a theory **has finite products** if it has unit sorts and pairs. Similarly, a theory **has finite sums** or **has finite coproducts** if it has void sorts and sums. An even more challenging exercise is the following: a theory **has exponentials** if it has finite products and for every sort #A#, #(A xx -) ⊣ (A => -)# (note, parameterized representability applies to #A#). Spell out the equations characterizing #A => B#.

Finite products start to lift us off the ground. So far the theories we’ve been working with have been extremely basic: a language with only unary functions, all terms being just a sequence of applications of function symbols. It shouldn’t be underestimated though. It’s more than enough to do monoid and group theory. A good amount of graph theory can be done with just this. And obviously we were able to establish several general results assuming only this structure. Nevertheless, while we can talk about specific groups, say, we can’t talk about the *theory* of groups. Finite products change this.

A theory with finite products allows us to talk about multi-ary function symbols and terms by considering unary function symbols from products. This allows us to do all of universal algebra. For example, the **theory of groups**, #ccT_(bb "Grp")#, consists of a sort #S# and all its products which we’ll abbreviate as #S^n# with #S^0 -= bb1# and #S^(n+1) -= S xx S^n#. It has three function symbols #tte : bb1 -> S#, #ttm : S^2 -> S#, and #tti : S -> S# plus the ones that having finite products requires. In fact, instead of just heaping an infinite number of sorts and function symbols into our theory — and we haven’t even gotten to equations — let’s define a compact set of data from which we can generate all this data.

A **signature**, #Sigma#, consists of a collection of sorts, #sigma#, a collection of *multi-ary* function symbols, and a collection of equations. Equations still remain pairs of terms, but we need to now extend our definition of terms for this context. A **term (in a signature)** is either a variable, #bbx_i^[A_0,A_1,...,A_n]# where the #A_i# are sorts and #0 <= i <= n#, the operators #tt "fst"# or #tt "snd"# applied to a term, the unit term written #(::)^A# with sort #A#, a pair of terms written #(: t_1, t_2 :)#, or the (arity-correct) application of a multi-ary function symbol to a series of terms, e.g. #tt "f"(t_1, t_2, t_3)#. As a Haskell data declaration, it might look like:
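A minimal sketch (the type and constructor names here are my own choices):

```haskell
-- Sorts of the generated theory: base sorts from sigma, the unit
-- sort, and pair sorts.
data Sort = Base String | One | Prod Sort Sort
  deriving (Eq, Show)

-- Terms in a signature: variables x_i^[A_0,...,A_n], projections,
-- the unit term, pairs, and multi-ary application.
data Term
  = Var [Sort] Int      -- bbx_i^[A_0,...,A_n]
  | Fst Term            -- fst t
  | Snd Term            -- snd t
  | Unit Sort           -- (::)^A
  | Pair Term Term      -- (: t1, t2 :)
  | Apply String [Term] -- f(t1,...,tn), function symbols named by strings
  deriving (Eq, Show)
```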

At this point, sorting (i.e. typing) the terms is no longer trivial, though it is still pretty straightforward. Sorts are either #bb1#, or #A xx B# for sorts #A# and #B#, or a sort #A in sigma#. The sources of function symbols and terms are *lists* of sorts.

- #bbx_i^[A_0, A_1, ..., A_n] : [A_0, A_1, ..., A_n] -> A_i#
- #(::)^A : [A] -> bb1#
- #(: t_1, t_2 :) : bar S -> T_1 xx T_2# where #t_i : bar S -> T_i#
- #tt "fst"(t) : bar S -> T_1# where #t : bar S -> T_1 xx T_2#
- #tt "snd"(t) : bar S -> T_2# where #t : bar S -> T_1 xx T_2#
- #tt "f"(t_1, ..., t_n) : bar S -> T# where #t_i : bar S -> T_i# and #ttf : [T_1,...,T_n] -> T#

The point of a signature was to represent a theory so we can compile a term of a signature into a term of a theory with finite products. The **theory generated from a signature** #Sigma# has the same sorts as #Sigma#. The equations will be equations of #Sigma#, with the terms compiled as will be described momentarily, plus for every pair of sorts the equations that describe pairs and the equations for #!#. Finally, we need to describe how to take a term of the signature and make a function symbol of the theory, but before we do that we need to explain how to convert those sources of the terms which are lists. That’s just a conversion to right nested pairs, #[A_0,...,A_n] |-> A_0 xx (... xx (A_n xx bb1) ... )#. The compilation of a term #t#, which we’ll write as #ccC[t]#, is defined as follows:

- #ccC[bbx_i^[A_0, A_1, ..., A_n]] = tt "fst"(tt "snd"^i(bbx_(A_0 xx (...))))# where #tt "snd"^i# means the #i#-fold application of #tt "snd"#
- #ccC[(::)^A] = !_A#
- #ccC[(: t_1, t_2 :)] = (: ccC[t_1], ccC[t_2] :)#
- #ccC[tt "fst"(t)] = tt "fst"(ccC[t])#
- #ccC[tt "snd"(t)] = tt "snd"(ccC[t])#
- #ccC[tt "f"(t_1, ..., t_n)] = tt "f"((: ccC[t_1], (: ... , (: ccC[t_n], ! :) ... :) :))#
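The compilation scheme can be transcribed almost line by line. This is a self-contained sketch using my own minimal term types: STerm for signature terms (with variables simplified to bare indices) and TTerm for terms of the generated theory, with function symbols named by strings:

```haskell
data STerm                 -- terms in a signature
  = SVar Int               -- x_i
  | SFst STerm
  | SSnd STerm
  | SUnit                  -- (::)
  | SPair STerm STerm      -- (: t1, t2 :)
  | SApp String [STerm]    -- f(t1,...,tn)

data TTerm                 -- terms of the generated theory
  = X                      -- the variable bbx
  | TFst TTerm
  | TSnd TTerm
  | Bang                   -- !
  | TPair TTerm TTerm
  | TApp String TTerm      -- unary application
  deriving (Eq, Show)

-- C[-]: compile a signature term into a theory term.
compile :: STerm -> TTerm
compile (SVar i)    = TFst (iterate TSnd X !! i)  -- project the i-th
                                                  -- component of a
                                                  -- right-nested tuple
compile SUnit       = Bang
compile (SPair a b) = TPair (compile a) (compile b)
compile (SFst t)    = TFst (compile t)
compile (SSnd t)    = TSnd (compile t)
compile (SApp f ts) = TApp f (foldr TPair Bang (map compile ts))
```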

As you may have noticed, the generated theory will have an infinite number of sorts, an infinite number of function symbols, and an infinite number of equations no matter what the signature is — even an empty one! Having an infinite number of things isn’t a problem as long as we can algorithmically describe them and this is what the signature provides. Of course, if you’re a (typical) mathematician you nominally don’t care about an algorithmic description. Besides being compact, signatures present a nicer term language. The theories are like a core or assembly language. We could define a slightly nicer variation where we keep a context and manage named variables leading to terms-in-context like:

#x:A, y:B |-- tt "f"(x, x, y)#

which is

#tt "f"(bbx_0^[A,B], bbx_0^[A,B], bbx_1^[A,B])#

for our current term language for signatures. Of course, compilation will be (slightly) trickier for the nicer language.

The benefit of having compiled the signature to a theory, in addition to being able to reuse the results we’ve established for theories, is that we only need to define operations on the theory, which is simpler since we only need to deal with pairs and unary function symbols. One example of this is that we’d like to extend our notion of interpretation to one that respects the structure of the signature, and we can do that by defining an interpretation of theories that respects finite products.

A **finite product preserving interpretation (into a finite product theory)**, #ccI#, is an interpretation (into a finite product theory) that additionally satisfies:

- #⟦bb1⟧^ccI = bb1#
- #⟦A xx B⟧^ccI = ⟦A⟧^ccI xx ⟦B⟧^ccI#

- #⟦!_A⟧^ccI = !_(⟦A⟧^ccI)#
- #⟦tt "fst"(t)⟧^ccI = tt "fst"(⟦t⟧^ccI)#
- #⟦tt "snd"(t)⟧^ccI = tt "snd"(⟦t⟧^ccI)#
- #⟦(: t_1, t_2 :)⟧^ccI = (: ⟦t_1⟧^ccI, ⟦t_2⟧^ccI :)#

where, for #bb "Set"#, #bb1 -= {{}}#, #xx# is the cartesian product, #tt "fst"# and #tt "snd"# are the projections, #!_A -= x |-> \{\}#, and #(: f, g :) -= x |-> (: f(x), g(x) :)#.

With signatures, we can return to our theory, now signature, of groups. #Sigma_bb "Grp"# has a single sort #S#, three function symbols #tte : [bb1] -> S#, #tti : [S] -> S#, and #ttm : [S, S] -> S#, with the following equations (written as equations rather than pairs):

- #tt "m"(tt "e"((::)^S), bbx_0^[S]) = bbx_0^[S]#
- #tt "m"(tt "i"(bbx_0^[S]), bbx_0^[S]) = tt "e"((::)^S)#
- #tt "m"(tt "m"(bbx_0^[S,S,S], bbx_1^[S,S,S]), bbx_2^[S,S,S]) = tt "m"(bbx_0^[S,S,S], tt "m"(bbx_1^[S,S,S], bbx_2^[S,S,S]))#

or using the nicer syntax:

- #x:S |-- tt "m"(tt "e"(), x) = x#
- #x:S |-- tt "m"(tt "i"(x), x) = tt "e"()#
- #x:S, y:S, z:S |-- tt "m"(tt "m"(x, y), z) = tt "m"(x, tt "m"(y, z))#

An actual group is then just a finite product preserving interpretation of (the theory generated by) this signature. All of universal algebra and much of abstract algebra can be formulated this way.

### The Simply Typed Lambda Calculus and Beyond

We can consider additionally assuming that our theory has exponentials. I left articulating exactly what that means as an exercise, but the upshot is we have the following two operations:

For any term #t : A xx B -> C#, we have the term #tt "curry"(t) : A -> C^B#. We also have the homomorphism #tt "app"_(AB) : B^A xx A -> B#. They satisfy:

- #tt "curry"(tt "app"(bbx_(B^A xx A))) = bbx_(B^A)#
- #tt "app"((: tt "curry"(t_1), t_2 :)) = t_1((: bbx_A, t_2 :))# where #t_1 : A xx B -> C# and #t_2 : A -> B#.
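In Haskell, reading #C^B# as the function type, these operations and the second law can be sketched as follows (`curryT` and `app` are my names; `curryT` is Prelude’s curry up to argument grouping):

```haskell
-- curry(t) : A -> C^B, for t : A xx B -> C
curryT :: ((a, b) -> c) -> a -> b -> c
curryT t a b = t (a, b)

-- app : C^B xx B -> C
app :: (b -> c, b) -> c
app (f, x) = f x

-- The second law, app (: curry(t1), t2 :) = t1 (: x, t2 :), rendered
-- as a checkable property:
law2 :: Eq c => ((a, b) -> c) -> a -> b -> Bool
law2 t a b = app (curryT t a, b) == t (a, b)
```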

We can view these, together with the product operations, as combinators and it turns out we can compile the simply typed lambda calculus into the above theory. This is exactly what the Categorical Abstract Machine did. The “Caml” in “O’Caml” stands for “Categorical Abstract Machine Language”, though O’Caml no longer uses the CAM. Conversely, every term of the theory can be expressed as a simply typed lambda term. This means we can view the simply typed lambda calculus as just a different presentation of the theory.

At this point, this presentation of category theory starts to connect to the mainstream categorical literature on universal algebra, internal languages, sketches, and internal logic. This page gives a synopsis of the relationship between type theory and category theory. For some reason, it is unusual to talk about the internal language of a plain category, but that is exactly what we’ve done here.

I haven’t talked about finite limits or colimits beyond products and coproducts, nor have I talked about even the infinitary versions of products and coproducts, let alone arbitrary limits and colimits. These can be handled the same way as products and coproducts. Formulating a language like signatures or the simply typed lambda calculus is a bit more complicated, but not that hard. I may make a follow-up article covering this among other things. I also have a side project (don’t hold your breath), that implements the internal language of a category with finite limits. The result looks roughly like a simple version of an algebraic specification language like the OBJ family. The RING theory described in the Maude manual gives an idea of what it would look like. In fact, here’s an example of the current actual syntax I’m using.3

```
theory Categories
  type O
  type A
  given src : A -> O
  given tgt : A -> O
  given id : O -> A
    satisfying o:O | src (id o) = o, tgt (id o) = o
  given c : { f:A, g:A | src f = tgt g } -> A
    satisfying (f, g):{ f:A, g:A | src f = tgt g }
      | tgt (c (f, g)) = tgt f, src (c (f, g)) = src g
    satisfying "left unit" (o, f):{ o:O, f:A | tgt f = o } | c (id o, f) = f
    satisfying "right unit" (o, f):{ o:O, f:A | src f = o } | c (f, id o) = f
    satisfying "associativity" (f, g, h):{ f:A, g:A, h:A | src f = tgt g, src g = tgt h }
      | c (c (f, g), h) = c (f, c (g, h))
endtheory
```

It turns out this is a particularly interesting spot in the design space. The fact that the theory of theories with finite limits is itself a theory with finite limits has interesting consequences. It is still relatively weak though. For example, it’s not possible to describe the theory of fields in this language.

There are other directions one could go. For example, the internal logic of monoidal categories is (a fragment of) ordered linear logic. You can cross this bridge either way. You can look at different languages and consider what categorical structure is needed to support the features of the language, or you can add features to the category and see how that impacts the internal language. The relationship is similar to the source language and a core/intermediate language in a compiler, e.g. GHC Haskell and System Fc.

**Decoder**

If you’ve looked at category theory at all, you can probably make most of the connections without me telling you. The table below outlines the mapping, but there are some subtleties. First, as a somewhat technical detail, my definition of a theory corresponds to a *small* category, i.e. a category which has a set of objects and a set of arrows. For the more programmer-minded, you should think of “set” as Set in Agda, i.e. similar to the * kind in Haskell. Usually “category” means “locally small category”, which may have a proper class of objects and, between any two objects, a set of arrows (though the union of all those sets may be a proper class). Again, for programmers, the distinction between “class” and “set” is basically the difference between Set and Set1 in Agda.4 To make my definition of theory closer to this, all that is necessary is, instead of having a set of function symbols, to have a family of sets of function symbols indexed by pairs of objects. Here’s what a partial definition in Agda of the two scenarios would look like:

The benefit of the notion of locally small category is that **Set** itself is a locally small category. The distinction I was making between interpretations into theories and interpretations into **Set** was due to the fact that **Set** wasn’t a theory. If I used a definition of theory corresponding to a locally small category, I could have combined the two notions of interpretation by making **Set** a theory. The notion of a small category, though, is still useful. Also, an interpretation into **Set** corresponds to the usual notion of a model or semantics, while interpretations into other theories were a less emphasized concept in traditional model theory and universal algebra.
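The indexing difference can be illustrated with a small Python analogue (my own sketch, not the Agda definitions the author has in mind): the "small category" style keeps one flat set of arrows, while the "locally small" style keeps a family of hom-sets indexed by pairs of objects.

```python
# Illustrative analogue (in Python, not Agda) of the two ways to organize
# the arrows/function symbols of a theory; the objects and arrow names
# ("A", "B", "f", "g") are hypothetical examples.

# "Small category" style: one flat set of arrows, each knowing its endpoints.
small = {
    "objects": {"A", "B"},
    "arrows": {("f", "A", "B"), ("g", "B", "B")},
}

# "Locally small" style: a family of hom-sets indexed by pairs of objects.
locally_small = {
    "objects": {"A", "B"},
    "hom": {
        ("A", "B"): {"f"},
        ("B", "B"): {"g"},
        ("A", "A"): set(),
        ("B", "A"): set(),
    },
}

def arrows_between(cat, x, y):
    """Recover a hom-set from the flat "small" presentation."""
    return {name for (name, s, t) in cat["arrows"] if (s, t) == (x, y)}

# The two organizations carry the same information between any two objects.
assert arrows_between(small, "A", "B") == locally_small["hom"][("A", "B")]
```

The point of the indexed version is that each hom-set can be a set even when there is a proper class of objects; the flat version forces the whole collection of arrows into one set.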

A less technical and more significant difference is that my definition of a theory doesn’t correspond to a category, but rather to a *presentation* of a category, from which a category can be generated. The analog of the arrows of the generated category is *terms*, not function symbols. This is a somewhat more natural route from the model theory/universal algebra/programming side. Similarly, having an explicit collection of equations, rather than just an equivalence relation on terms, is part of the presentation of the category but not part of the category itself.
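As a toy illustration of "terms, not function symbols, are the arrows", here is a Python sketch (a hypothetical representation of my own, not the article's system) in which arrows of the generated category are formal composites of function symbols, with empty composites as identities:

```python
# A tiny hypothetical presentation: function symbols f : A -> B and g : B -> C.
symbols = {"f": ("A", "B"), "g": ("B", "C")}

# A term (arrow of the generated category) is (source object, list of symbols
# in application order); the identity at an object is the empty composite.
def ident(obj):
    return (obj, [])

def src(term):
    return term[0]

def tgt(term):
    obj, syms = term
    return symbols[syms[-1]][1] if syms else obj

def compose(t2, t1):
    """c(t2, t1): first do t1, then t2; defined only when src t2 = tgt t1."""
    assert src(t2) == tgt(t1), "composition requires src t2 = tgt t1"
    return (src(t1), t1[1] + t2[1])

f = ("A", ["f"])
g = ("B", ["g"])
gf = compose(g, f)
assert (src(gf), tgt(gf)) == ("A", "C")
# Unit laws hold definitionally for this representation.
assert compose(g, ident("B")) == g and compose(ident("C"), g) == g
```

With no equations, this is exactly the free category on the underlying graph of the presentation; adding equations would quotient the lists of symbols by the generated equivalence relation.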

| my terminology | categorical terminology |
| --- | --- |
| interpretation into **Set** | a (co)presheaf |
| interpretation into a theory | functor |
| homomorphism | natural transformation |
| simply typed lambda calculus (with products) | a cartesian closed category |

**Conclusion**

In some ways I’ve stopped just when things were about to get good. I may do a follow-up to elaborate on this good stuff. Some examples are: if I expand the definition so that **Set** becomes a “theory”, then interpretations also form such a “theory”, and these are often what we’re really interested in. The category of finite-product preserving interpretations of the theory of groups essentially *is* the category of groups. In fact, universal algebra is, in categorical terms, just the study of categories with finite products and finite-product preserving functors from them, particularly into **Set**. It’s easy to generalize this in many directions. It’s also easy to make very general definitions, like a general definition of a free algebraic structure. In general, we’re usually more interested in the interpretations of a theory than the theory itself.

While I often do advocate thinking in terms of internal languages of categories, I’m not sure that it is a preferable perspective for the very basics of category theory. Nevertheless, there are a few reasons why I wrote this. First, this very syntactical approach is, I think, more accessible to someone coming from a programming background. From this view, a category is a very simple programming language. Adding structure to the category corresponds to adding features to this programming language. Interpretations are denotational semantics.

Another aspect of this presentation that is quite different is the use of, and emphasis on, collages. Collages correspond to profunctors, a crucially important and enabling concept that is rarely covered in categorical introductions. The characterization of profunctors as collages in Vaughan Pratt’s paper (not using that name) was one of the things I enjoyed about that paper and part of what prompted me to start writing this. In earlier drafts of this article, I was unable to incorporate collages in a meaningful way as I was trying to start from profunctors. This approach just didn’t add value. Collages just looked like a bizarre curio and weren’t integrated into the narrative at all. For other reasons, though, I ended up revisiting the idea of a heteromorphism. My (fairly superficial) opinion is that once you have the notion of functors and natural transformations, adding the notion of heteromorphisms has a low power-to-weight ratio, though it does make some things a bit nicer. Nevertheless, in thinking of how best to fit them into this context, it was clear that collages provided the perfect mechanism (which isn’t a big surprise), and the result works rather nicely. When I realized a fact that can be cryptically but compactly represented as 𝒦_𝒯 ≃ 𝕀 × 𝒯, where 𝕀 is the interval category, i.e. two objects with a single arrow joining them, I realized that this is actually an interesting perspective. Since most of this article was written at that point, I wove collages into the narrative, replacing some things. If, though, I had started with this perspective from the beginning, I suspect I would have made a significantly different article, though the latter sections would likely be similar.

1. It’s actually better to organize this as a *family* of collections of function symbols indexed by pairs of sorts.↩

2. Instead of having equations that generate an equivalence relation on (raw) terms, we could simply require that an equivalence relation on (raw) terms be directly provided.↩

3. Collaging is actually quite natural in this context. I already intend to support one theory importing another. A collage is just a theory that imports two others and then adds function symbols between them.↩

4. For programmers familiar with Agda, at least, if you haven’t made this connection, this might help you understand and appreciate what a “class” is versus a “set” and what “size issues” are, which are typically handled extremely vaguely in a lot of the literature.↩

### Mark Jason Dominus: Imaginary Albanian eggplant festivals… IN SPACE

Wikipedia has a list of harvest festivals which includes this intriguing entry:

*Ysolo*: festival marking the first day of harvest of eggplants in Tirana, Albania

(It now says “citation needed”; I added that yesterday.)

I am confident that this entry, inserted in July 2012 by an anonymous user, is a hoax. When I first read it, I muttered “Oh, what bullshit,” but then went looking for a reliable source, because you never know. I have occasionally been surprised in the past, but this time I found clear evidence of a hoax: There are only a couple of scattered mentions of Ysolo on a couple of blogs, all from after 2012, and nothing at all in Google Books about Albanian eggplant celebrations. Nor is there an article about it in Albanian Wikipedia.

But reality gave ground before I arrived on the scene. Last September NASA's *Dawn* spacecraft visited the dwarf planet Ceres. Ceres is named for the Roman goddess of the harvest, and so NASA proposed harvest-related names for Ceres’ newly-discovered physical features. It appears that someone at NASA ransacked the Wikipedia list of harvest festivals without checking whether they were real, because there is now a large mountain at Ceres’ north pole whose *official* name is Ysolo Mons, named for this spurious eggplant festival. (See also: NASA JPL press release; USGS Astrogeology Science Center announcement.)

To complete the magic circle of fiction, the Albanians might begin to celebrate the previously-fictitious eggplant festival. (And why not? Eggplants are lovely.) Let them do it for a couple of years, and then Wikipedia could document the *real* eggplant festival… Why not fall under the spell of Tlön and submit to the minute and vast evidence of an ordered planet?

Happy Ysolo, everyone.

### Dominic Steinitz: Mercator: A Connection with Torsion

In most presentations of Riemannian geometry, e.g. O’Neill (1983) and Wikipedia, the fundamental theorem of Riemannian geometry (“the miracle of Riemannian geometry”) is given: that for any semi-Riemannian manifold there is a unique torsion-free metric connection. I assume partly because of this and partly because the major application of Riemannian geometry is General Relativity, connections with torsion are given little if any attention.

It turns out we are all very familiar with a connection with torsion: the Mercator projection. Some mathematical physics texts, e.g. Nakahara (2003), allude to this but leave the details to the reader. Moreover, this connection respects the metric induced from Euclidean space.

We use SageManifolds to assist with the calculations. We hint at how this might be done more slickly in Haskell.

**A Cartographic Aside**

```python
%matplotlib inline
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import cartopy
import cartopy.crs as ccrs
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter

plt.figure(figsize=(8, 8))
ax = plt.axes(projection=cartopy.crs.Mercator())
ax.gridlines()
ax.add_feature(cartopy.feature.LAND)
ax.add_feature(cartopy.feature.COASTLINE)
plt.show()
```

We can see Greenland looks much broader at the north than in the middle. But if we use a polar projection (below) then we see this is not the case. Why then is the Mercator projection used in preference to e.g. the polar projection or the once controversial Gall-Peters? See here for more on map projections.

```python
plt.figure(figsize=(8, 8))
bx = plt.axes(projection=cartopy.crs.NorthPolarStereo())
bx.set_extent([-180, 180, 90, 50], ccrs.PlateCarree())
bx.gridlines()
bx.add_feature(cartopy.feature.LAND)
bx.add_feature(cartopy.feature.COASTLINE)
plt.show()
```

**Colophon**

This is written as a Jupyter notebook. In theory, it should be possible to run it assuming you have installed at least sage and Haskell. To publish it, I used

```
jupyter-nbconvert --to markdown Mercator.ipynb
pandoc -s Mercator.md -t markdown+lhs -o Mercator.lhs \
    --filter pandoc-citeproc --bibliography DiffGeom.bib
BlogLiteratelyD --wplatex Mercator.lhs > Mercator.html
```

Not brilliant but good enough.

Some commands to jupyter to display things nicely.

```python
%display latex
viewer3D = 'tachyon'
```

**Warming Up With SageManifolds**

Let us try a simple exercise: finding the connection coefficients of the Levi-Civita connection for the Euclidean metric on ℝ² in polar co-ordinates.

Define the manifold.

```python
N = Manifold(2, 'N', r'\mathcal{N}', start_index=1)
```

Define a chart and frame with Cartesian co-ordinates.

```python
ChartCartesianN.<x,y> = N.chart()
FrameCartesianN = ChartCartesianN.frame()
```

Define a chart and frame with polar co-ordinates.

```python
ChartPolarN.<r,theta> = N.chart()
FramePolarN = ChartPolarN.frame()
```

The standard transformation from Cartesian to polar co-ordinates.

```python
cartesianToPolar = ChartCartesianN.transition_map(ChartPolarN,
                                                  (sqrt(x^2 + y^2), arctan(y/x)))
print(cartesianToPolar)
# Change of coordinates from Chart (N, (x, y)) to Chart (N, (r, theta))
print(latex(cartesianToPolar.display()))
cartesianToPolar.set_inverse(r * cos(theta), r * sin(theta))
# Check of the inverse coordinate transformation:
#   x == x,  y == y,  r == abs(r),  theta == arctan(sin(theta)/cos(theta))
```
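As a sanity check outside Sage, the same transition map and its inverse can be verified numerically in plain Python (valid for x > 0, where arctan(y/x) gives the correct angle):

```python
from math import sqrt, atan, cos, sin, isclose

# The transition maps from the Sage code above, as plain functions.
def cartesian_to_polar(x, y):
    # (x, y) -> (r, theta); valid for x > 0
    return sqrt(x**2 + y**2), atan(y / x)

def polar_to_cartesian(r, theta):
    return r * cos(theta), r * sin(theta)

# Round-trip check on a sample point.
x, y = 3.0, 4.0
r, theta = cartesian_to_polar(x, y)
x2, y2 = polar_to_cartesian(r, theta)
assert isclose(r, 5.0)
assert isclose(x, x2) and isclose(y, y2)
```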

Now we define the metric to make the manifold Euclidean.

```python
g_e = N.metric('g_e')
g_e[1,1], g_e[2,2] = 1, 1
```

We can display this in Cartesian co-ordinates.

print(latex(g_e.display(FrameCartesianN)))

And we can display it in polar co-ordinates

print(latex(g_e.display(FramePolarN)))

Next let us compute the Levi-Civita connection from this metric.

nab_e = g_e.connection() print(latex(nab_e))

If we use Cartesian co-ordinates, we expect all the connection coefficients to vanish: Γ^k_{ij} = 0. Only non-zero entries get printed.

```python
print(latex(nab_e.display(FrameCartesianN)))
```

Just to be sure, we can print out all the entries.

print(latex(nab_e[:]))

In polar co-ordinates, we get

print(latex(nab_e.display(FramePolarN)))

which we can re-write as

Γ^r_{θθ} = −r,  Γ^θ_{rθ} = Γ^θ_{θr} = 1/r

with all other entries being 0.

**The Sphere**

We define a 2-dimensional manifold. We call it the 2-dimensional (unit) sphere, but we are going to remove a meridian to allow us to define the desired connection with torsion on it.

S2 = Manifold(2, 'S^2', latex_name=r'\mathbb{S}^2', start_index=1) print(latex(S2))

To start off with we cover the manifold with two charts.

```python
polar.<th,ph> = S2.chart(r'th:(0,pi):\theta ph:(0,2*pi):\phi'); print(latex(polar))
mercator.<xi,ze> = S2.chart(r'xi:(-oo,oo):\xi ze:(0,2*pi):\zeta'); print(latex(mercator))
```

We can now check that we have two charts.

print(latex(S2.atlas()))

We can then define co-ordinate frames.

```python
epolar = polar.frame(); print(latex(epolar))
emercator = mercator.frame(); print(latex(emercator))
```

And define a transition map and its inverse from one frame to the other checking that they really are inverses.

```python
xy_to_uv = polar.transition_map(mercator, (log(tan(th/2)), ph))
xy_to_uv.set_inverse(2*arctan(exp(xi)), ze)
# Check of the inverse coordinate transformation:
#   th == 2*arctan(sin(1/2*th)/cos(1/2*th)),  ph == ph,  xi == xi,  ze == ze
```

We can define the metric, which is the pullback of the Euclidean metric on ℝ³.
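Again as a plain-Python sanity check (independent of Sage), we can confirm numerically that ξ = log(tan(θ/2)) and θ = 2 arctan(eξ) really are mutually inverse on the chart domain:

```python
from math import log, tan, atan, exp, pi, isclose

# The transition maps between the polar and Mercator charts, as plain functions.
def polar_to_mercator(th, ph):
    return log(tan(th / 2)), ph          # (xi, zeta)

def mercator_to_polar(xi, ze):
    return 2 * atan(exp(xi)), ze         # (theta, phi)

# Round-trip check at a few sample values of theta in (0, pi).
for th in (0.3, pi / 2, 2.8):
    xi, ze = polar_to_mercator(th, 1.0)
    th2, ph2 = mercator_to_polar(xi, ze)
    assert isclose(th, th2) and isclose(ph2, 1.0)
```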

```python
g = S2.metric('g')
g[1,1], g[2,2] = 1, (sin(th))^2
```

And then calculate the Levi-Civita connection defined by it.

nab_g = g.connection() print(latex(nab_g.display()))

We know the geodesics defined by this connection are the great circles.

We can check that this connection respects the metric.

print(latex(nab_g(g).display()))

And that it has no torsion.

```python
print(latex(nab_g.torsion().display()))   # 0
```

**A New Connection**

Let us now define an orthonormal frame.

```python
ch_basis = S2.automorphism_field()
ch_basis[1,1], ch_basis[2,2] = 1, 1/sin(th)
e = S2.default_frame().new_frame(ch_basis, 'e')
print(latex(e))
```

We can calculate the dual 1-forms.

```python
dX = S2.coframes()[2]; print(latex(dX))
print(latex((dX[1], dX[2])))
```

print(latex((dX[1][:], dX[2][:])))

In this case it is trivial to check that the frame and coframe really are orthonormal, but we let Sage do it anyway.

print(latex(((dX[1](e[1]).expr(), dX[1](e[2]).expr()), (dX[2](e[1]).expr(), dX[2](e[2]).expr()))))

Let us define two vectors to be parallel if their angles to a given meridian are the same. For this to be true we must have a connection whose coefficients all vanish in the orthonormal frame, i.e. ∇e = 0.

```python
nab = S2.affine_connection('nabla', latex_name=r'\nabla')
nab.add_coef(e)
```

Displaying the connection only gives the non-zero components.

```python
print(latex(nab.display(e)))
```

For safety, let us check all the components explicitly.

print(latex(nab[e,:]))

Of course the components are *not* non-zero in other frames.

print(latex(nab.display(emercator)))

This connection also respects the metric .

print(latex(nab(g).display()))

Thus, since the Levi-Civita connection is unique, it *must* have torsion.

The equations for geodesics are

d²x^k/dt² + Γ^k_{ij} (dx^i/dt)(dx^j/dt) = 0

Explicitly for both variables in the polar co-ordinates chart.

We can check that the proposed curves are solutions, although Sage needs a bit of prompting to help it.

```python
t = var('t'); a = var('a')
print(latex(diff(a * log(tan(t/2)), t).simplify_full()))
```

We can simplify this further by recalling the trigonometric identity sin 2t = 2 sin t cos t.

```python
print(latex(sin(2 * t).trig_expand()))
print(latex(diff(a / sin(t), t)))
```

In the mercator co-ordinates chart this is

In other words: straight lines.

Reparametrising, we obtain

Let us draw such a curve.

```python
R.<t> = RealLine(); print(R)   # Real number line R
print(dim(R))                  # 1
c = S2.curve({polar: [2*atan(exp(-t/10)), t]}, (t, -oo, +oo), name='c')
print(latex(c.display()))
c.parent()
c.plot(chart=polar, aspect_ratio=0.1)
```

It’s not totally clear this is curved so let’s try with another example.

```python
d = S2.curve({polar: [2*atan(exp(-t)), t]}, (t, -oo, +oo), name='d')
print(latex(d.display()))
d.plot(chart=polar, aspect_ratio=0.2)
```

Now it’s clear that a *straight* line is curved in polar co-ordinates.

But of course in Mercator co-ordinates, it is a straight line. This explains its popularity with mariners: if you draw a straight line on your chart and follow that bearing or rhumb line using a compass, you will arrive at the end of the straight line. Of course, it is not the shortest path; great circles are the shortest paths, but a rhumb line is much easier to navigate.
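We can also verify the rhumb-line property of the curve c numerically: composing θ(t) = 2 arctan(e^(−t/10)) with the Mercator co-ordinate ξ = log(tan(θ/2)) gives the straight line ξ = −t/10 (with ζ = φ = t), i.e. constant bearing. A small plain-Python check, independent of Sage:

```python
from math import atan, exp, log, tan, isclose

# Mercator xi-coordinate of the curve c: theta(t) = 2*arctan(exp(-t/10)).
def xi_of(t):
    theta = 2 * atan(exp(-t / 10))
    return log(tan(theta / 2))

# In (xi, zeta) coordinates the curve is the straight line xi = -t/10.
for t in (-5.0, 0.0, 7.0):
    assert isclose(xi_of(t), -t / 10, abs_tol=1e-12)
```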

```python
c.plot(chart=mercator, aspect_ratio=0.1)
d.plot(chart=mercator, aspect_ratio=1.0)
```

We can draw these curves on the sphere itself, not just on its charts.

```python
R3 = Manifold(3, 'R^3', r'\mathbb{R}^3', start_index=1)
cart.<X,Y,Z> = R3.chart(); print(latex(cart))
Phi = S2.diff_map(R3, {
    (polar, cart):    [sin(th) * cos(ph), sin(th) * sin(ph), cos(th)],
    (mercator, cart): [cos(ze) / cosh(xi), sin(ze) / cosh(xi), sinh(xi) / cosh(xi)]
}, name='Phi', latex_name=r'\Phi')
```

We can either plot using polar co-ordinates.

```python
graph_polar = polar.plot(chart=cart, mapping=Phi, nb_values=25, color='blue')
show(graph_polar, viewer=viewer3D)
```

Or using Mercator co-ordinates. In either case we get the sphere (minus the prime meridian).

```python
graph_mercator = mercator.plot(chart=cart, mapping=Phi, nb_values=25, color='red')
show(graph_mercator, viewer=viewer3D)
```

We can plot the curve with an angle to the meridian of

graph_c = c.plot(mapping=Phi, max_range=40, plot_points=200, thickness=2) show(graph_polar + graph_c, viewer=viewer3D)And we can plot the curve at angle of to the meridian.

```python
graph_d = d.plot(mapping=Phi, max_range=40, plot_points=200, thickness=2, color="green")
show(graph_polar + graph_c + graph_d, viewer=viewer3D)
```

**Haskell**

With automatic differentiation and symbolic numbers, symbolic differentiation is straightforward in Haskell.

```haskell
> import Data.Number.Symbolic
> import Numeric.AD
>
> x = var "x"
> y = var "y"
>
> test xs = jacobian ((\x -> [x]) . f) xs
>   where
>     f [x, y] = sqrt $ x^2 + y^2
```

    ghci> test [1, 1]
    [[0.7071067811865475,0.7071067811865475]]
    ghci> test [x, y]
    [[x/(2.0*sqrt (x*x+y*y))+x/(2.0*sqrt (x*x+y*y)),y/(2.0*sqrt (x*x+y*y))+y/(2.0*sqrt (x*x+y*y))]]

Anyone wishing to take on the task of producing a Haskell version of sagemanifolds is advised to look here before embarking on it.

**Appendix A: Conformal Equivalence**

Agricola and Thier (2004) show that the geodesics of the Levi-Civita connection of a conformally equivalent metric are the geodesics of a connection with vectorial torsion. Let’s put some, but not all, of the flesh on the bones.

The **Koszul formula** (see e.g. (O’Neill 1983)) characterizes the Levi-Civita connection:

2g(∇_X Y, Z) = X g(Y, Z) + Y g(Z, X) − Z g(X, Y) − g(X, [Y, Z]) + g(Y, [Z, X]) + g(Z, [X, Y])

Being more explicit about the metric, this can be re-written as

Let be the Levi-Civita connection for the metric where . Following [Gadea2010] and substituting into the Koszul formula and then applying the product rule

where, as usual, the gradient vector field of a function f is defined via g(grad f, X) = X f.

Let’s try an example.

```python
nab_tilde = S2.affine_connection('nabla_t', r'\tilde{\nabla}')
f = S2.scalar_field(-ln(sin(th)), name='f')
for i in S2.irange():
    for j in S2.irange():
        for k in S2.irange():
            nab_tilde.add_coef()[k,i,j] = \
                nab_g(polar.frame()[i])(polar.frame()[j])(polar.coframe()[k]) + \
                polar.frame()[i](f) * polar.frame()[j](polar.coframe()[k]) + \
                polar.frame()[j](f) * polar.frame()[i](polar.coframe()[k]) + \
                g(polar.frame()[i], polar.frame()[j]) * \
                polar.frame()[1](polar.coframe()[k]) * cos(th) / sin(th)
print(latex(nab_tilde.display()))
print(latex(nab_tilde.torsion().display()))   # 0
g_tilde = exp(2 * f) * g
print(latex(g_tilde.parent()))
```

print(latex(g_tilde[:]))

nab_g_tilde = g_tilde.connection() print(latex(nab_g_tilde.display()))

It’s not clear (to me at any rate) what the solutions are to the geodesic equations despite the guarantees of Agricola and Thier (2004). But let’s try a different chart.

print(latex(nab_g_tilde[emercator,:]))

In this chart, the geodesics are clearly straight lines as we would hope.

ReferencesAgricola, Ilka, and Christian Thier. 2004. “The geodesics of metric connections with vectorial torsion.” *Annals of Global Analysis and Geometry* 26 (4): 321–32. doi:10.1023/B:AGAG.0000047509.63818.4f.

Nakahara, M. 2003. “Geometry, Topology and Physics.” *Text* 822: 173–204. doi:10.1007/978-3-642-14700-5.

O’Neill, B. 1983. *Semi-Riemannian Geometry with Applications to Relativity, 103*. Pure and Applied Mathematics. Elsevier Science. https://books.google.com.au/books?id=CGk1eRSjFIIC.

### Yesod Web Framework: The New (Experimental) Yesod Development Server - Feedback Requested!

I'm guessing almost every Yesod user has - at some point - used the venerable yesod devel command, which launches a server which auto-recompiles your source code on any file changes. This has been a core part of the Yesod ecosystem for many years. Unfortunately, it's had to be far more complicated than I'd have liked:

- Since it predates addDependentFile (good work Greg Weber on getting that in!), it has some pretty complex logic around guessing which external files (like Hamlet files) should force a recompile. (Adding support for addDependentFile to the current yesod devel is possible, but it's a non-trivial undertaking.)
- In order to ensure a consistent set of dependencies, it does some real fancy footwork around intercepting arguments passed to ghc and linker executables.
- In order to parse various files, it links against the ghc library, tying it to a specific compiler version. This makes things difficult for users (don't accidentally use yesod from GHC 7.10.3 with GHC 8.0.1!), and sometimes really painful for maintainers.

For a few months now, I've been meaning to greatly simplify yesod devel, but the maintenance burden finally gave me the excuse I needed to bite the bullet and do it. The result is a dramatic simplification of the code base. First I'd like to ask for user feedback, and then I'll discuss some of the details of implementation.

**Please try it out!**

Since this is such a big change, I'd really appreciate it if others could give this a shot before I release it. There are two ways you can do this:

Method 1:

- git clone https://github.com/yesodweb/yesod --branch 1304-stack-based-devel yesod-new-devel
- cd yesod-new-devel
- stack install yesod-bin
- From your project directory, run yesod devel. NOTE: do *not* use stack exec -- yesod devel; you want to use the newly globally installed executable, not the one from your snapshot!

Method 2:

Add the following to your stack.yaml file's packages list:

```yaml
- location:
    git: https://github.com/yesodweb/yesod
    commit: f3fc735a25eb3d5c051c761b59070eb9a0e4e156
  subdirs:
    - yesod-bin
  extra-dep: true
```

You will likely also need to add the following to your stack.yaml file's extra-deps list:

```yaml
- say-0.1.0.0
- typed-process-0.1.0.0
```

Then:

- stack build yesod-bin
- stack exec -- yesod devel

Use whichever method you feel most comfortable with. Please let me know both successes and failures, and then I'll try to get this rolled out. Comments would be great on the Github pull request. So far, in my limited testing, I've found that the new yesod devel runs faster than the current one, but that could very much be confirmation bias speaking.

Note: there are a few removed features in this update, please see the changelog.

**How it works**

The big change - as the branch name implies - was depending entirely on Stack for all of the heavy lifting. Stack already provides a --file-watch command to automatically recompile, and uses GHC's own addDependentFile information to track external file dependencies. This cuts out the vast majority of the complexity. There's no longer any need to depend on the ghc library, there's less Cabal library code involved (making cross-version support much simpler), and almost everything is handled by shelling out to external executables.
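The essence of --file-watch — poll the tracked files and rebuild when any modification time changes — can be sketched in a few lines of Python (a toy illustration only, not Stack's actual implementation; `rebuild` is a caller-supplied stand-in):

```python
import os
import time

def snapshot(paths):
    """Map each tracked file to its last-modification time."""
    return {p: os.path.getmtime(p) for p in paths if os.path.exists(p)}

def changed(before, after):
    """Files that appeared, disappeared, or were modified between snapshots."""
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}

def watch(paths, rebuild, poll=0.5):
    """Rerun `rebuild` whenever any tracked file changes (polling loop)."""
    state = snapshot(paths)
    while True:
        time.sleep(poll)
        new = snapshot(paths)
        if changed(state, new):
            rebuild()
            state = new
```

Real implementations usually use OS-level file-change notifications rather than polling, but the rebuild-on-change contract is the same.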

I also got to redo the concurrency aspects of this using my absolute favorite package in the world: async. The result is, in my opinion, very straightforward. I also leveraged some of the newer libraries I've worked on, like safe-exceptions, typed-process, and say.

The code is (finally) well commented, so you can jump in and look yourself. I've also added a decent README, and an example of using yesod devel with a non-Yesod project.

### Neil Mitchell: The Haskell.org Website Working Group (HWWG)

The goal of the Haskell Website Working Group is to make sure the Haskell website caters to the needs of Haskell programmers, particularly beginners. In doing so we hope to either combine or differentiate haskell.org and haskell-lang.org, and give people clear recommendations of what "downloading Haskell" means. Concretely, we hope that either haskell-lang.org redirects to haskell.org, or that haskell-lang.org ends up being used for something very different from today.

**The Haskell Website Working Group (HWWG)**

**Scope and goals**

- The HWWG is responsible for the design and content of the user-facing haskell.org web site, including tutorials, videos, news, resource, downloads, etc.

- The HWWG is not responsible for:
  - The infrastructure of haskell.org
  - Toolchains, Hackage, compilers, etc

- The HWWG focuses on serving users of Haskell, not suppliers of technology or libraries.
- An explicit goal is to re-unite the haskell.org and haskell-lang.org web sites.

- HWWG is not responsible for actually doing everything! The web site is on github. Anyone can make a pull request. The general expectation is that uncontroversial changes will be accepted and committed without much debate.
- If there is disagreement about a proposed change, it's up to the HWWG to engage in (open) debate, and to seek input as widely as time allows, but finally to make a decision.

Initial membership comprises:

- Neil Mitchell (chair)
- Nicolas Wu
- Andrew Cowie
- Vincent Hanquez
- Ryan Trinkle
- Chris Done

It is expected the committee will change over time, but the mechanism has not yet been thought about.

**Rules of engagement**

- Recognising that honestly-held judgements may differ, we will be scrupulously polite both in public and in private.
- Recognising that Haskell has many users, and that different users have different needs and tastes, we want haskell.org to be inclusive rather than exclusive, providing a variety of alternative resources (toolchains, tutorials, books, etc) clearly signposted with their intended audiences.
- Ultimately the haskell.org committee owns the haskell.org URL, but it delegates authority for the design and content of the web site to the HWWG. In extremis, if the haskell.org committee believes that the HWWG is mismanaging the web site, it can revoke that delegation.

### Joachim Breitner: microG on Jolla

I am incorrigible in picking non-mainstream, open smartphones, and then struggling hard. Back in 2008, I tried to use the OpenMoko FreeRunner, but eventually gave up because of hardware glitches and reverted to my good old Siemens S35. It was not that I was unwilling to put up with inconveniences, but as soon as it makes life more difficult for the people I communicate with, it becomes hard to sustain.

Two years ago I tried again, and got myself a Jolla phone, running Sailfish OS. Things are much nicer now: the hardware is mature, battery life is good, and the Android compatibility layer enables me to run many important apps that are hard to replace, especially the Deutsche Bahn Navigator and various messengers, namely Telegram, Facebook Messenger, Threema and GroupMe.

Some apps, however, require Google Play Services, which provides a bunch of common tasks and usually comes with the Google Play store; these would not run on my phone, as Google Play is not supported on Sailfish OS. So far, the most annoying ones of that sort were Uber and Lyft, making me pay for expensive taxis when others could ride more cheaply, but I can live with that. I tried to install Google Play Services from shady sources, but it would regularly crash.

**Signal on Jolla**

Now in Philadelphia, people urged me to use the Signal messenger, and I was convinced by its support for good end-to-end crypto, while still supporting offline messages and allowing me to switch from my phone to my desktop and back during a conversation. The official Signal app uses Google Cloud Messaging (GCM, part of Google Play Services) to get push updates about new posts, and while I do not oppose this use of Google services (it really is just a ping without any metadata), this is a problem on Sailfish OS.

Luckily, the Signal client is open source, and someone created a “LibreSignal” edition that replaced the use of GCM with websockets, and indeed, this worked on my phone, and I could communicate.

Things were not ideal, though: I would often have to restart the app to get newly received messages; messages that I send via Signal Desktop would often not show up on the phone and, most severe, basically after every three messages, sending more messages from Desktop would stop working for my *correspondents*, which freaked them out. (Strangely it continued working from their phone app, so we coped for a while.)

So again, my choice of non-standard devices causes inconveniences to others. This, and the fact that the original authors of Signal and the maintainers of LibreSignal got into a fight that ended with LibreSignal being discontinued, meant that I had to change something about this situation. I was almost ready to give in and get myself a Samsung S7 or something boring of the sort, but then I decided to tackle this issue once more, following some of the more obscure instructions out there, trying to get vanilla Signal working on my phone. About a day later, I got it, and this is how I did it.

**microG**

So I need Google Play Services somehow, but installing the “real thing” did not seem to be very promising (I tried, and regularly got pop-ups telling me that Play Services has crashed). But I found some references to a project called “microG”, which is an independent re-implementation of (some of) the Play Services, in particular including GCM.

Installing microG itself was easy, as you can add their repository to F-Droid. I installed the core services, the services framework and the fake store apps. If this had been all that was to do, things would be easy!

**Play Store detection workarounds**

But Signal would still complain about the lack of Google Play Services. It asks Android if an app with a certain name is installed, and refuses to work if this app does not exist. For some reason, the microG apps cannot just take the names of the “real” Google apps.

There seem to be two ways of working around this: Patching Signal, or enabling Signature Spoofing.

The initially most promising instructions (which are in a README in a tarball on a fishy file hoster linked from an answer on the Jolla support forum…) suggested patching Signal, and actually came both with a version of an app called “Lucky Patcher” as well as a patched Android package, but both about two years old. I tried a recent version of the Lucky Patcher, but it failed to patch the current version of Signal.

**Signature Spoofing**

So on to Signature Spoofing. This is a feature of some non-standard Android builds that allows apps (such as microG) to fake the existence of other apps (the Play Store), and is recommended by the microG project. Sailfish OS’s Android compatibility layer “Alien Dalvik” does not support it out of the box, but there is a tool, “tingle”, that adds this feature to existing Android systems. One just has to get the /system/framework/framework.jar file, put it into the input folder of this project, run python main.py, select 2, and copy the framework.jar from output/ back. Great.

**Deodexing Alien Dalvik**

Only that it only works on “deodexed” files. I did not know anything about odexed Android Java classes (and did not really want to know), but there was no way around it. Following this explanation I gathered that one finds files foo.odex in the Android system folder, runs some tool on them to create a classes.dex file, adds that to the corresponding foo.jar or foo.apk file, copies this back to the phone and deletes the foo.odex file.

The annoying thing is that one does not only have to do this for framework.jar in order to please tingle: if one does it to one odex file, one has to do it to all of them! For people using Windows, the Universal Deodexer V5 seems to be a convenient tool, but I had to proceed more manually.

So I first fetched “smali” and compiled it using ./gradlew build. Then I fetched the folders /opt/alien/system/framework and /opt/alien/system/app from the phone (e.g. using scp). Keep a backup of these in case something breaks. Then I ran these commands (disclaimer: I fetched these from my bash history and slightly cleaned them up. This is *not* a fire-and-forget script! Use it only when you know what it, and you, are doing):

The resulting framework.jar can now be patched with tingle:

```
mv framework/framework.jar ~/build/tingle/input
cd ~/build/tingle
./main.py # select 2
cd -
mv ~/build/tingle/output/framework.jar framework/framework.jar
```

Now I copy these framework and app folders back onto my phone, and restart Dalvik:

```
devel-su systemctl restart aliendalvik.service
```

It might start a bit more slowly than usual, but eventually all the Android apps should work as before.

The final bit that was missing in my case was that I had to reinstall Signal: if it is installed before microG, it does not get permission to use GCM, and when it tries to use it (while registering, after generating the keys) it just crashes. I copied /data/data/org.thoughtcrime.securesms/ before removing Signal and moved it back afterwards (with cp -a to preserve permissions) so that I could keep my history.

And now, it seems, vanilla Signal is working just fine on my Jolla phone!

**What’s missing**

Am I completely happy with Signal? No! An important feature it lacks is a way to export all data (message history including media files) in a file format that can be read without Signal, e.g. YAML files or clean HTML. I do want to be able to re-read some of the more interesting conversations when I am 74 or 75, and I doubt that there will be a Signal app, or even Android, then. I hope this becomes available in time, maybe in the desktop version.

I would also hope that pidgin gets support for the Signal protocol, so that I can conveniently use one program for all my messaging needs on the desktop.

Finally it would be nice if my Signal identity was less tied to one phone number. I have a German and a US phone number, and would want to be reachable under both on all my clients. (If you want to contact me on Signal, use my US phone number.)

**Alternatives**

Could I have avoided this hassle by simply convincing people to use something other than Signal? Tricky, at the moment. Telegram (which works super reliably for me, and has a pidgin plugin) has dubious crypto and does not support crypto when using multiple clients. Threema has no desktop client that I know of. OTR on top of Jabber does not support offline messages. So nothing great seems to exist right now.

In the long run, the best bet seems to be OMEMO (which is, in essence, the Signal protocol) on top of Jabber. It is currently supported by one Android Jabber client (Conversations) and one Desktop application (gajim, via a plugin). I should keep an eye on pidgin support for OMEMO and other development around this.

### Polymorphism, subtyping and type inference in MLsub

I am very enthusiastic about the following paper: it brings new ideas and solves a problem that I did not expect to be solvable, namely usable type inference when both polymorphism and subtyping are implicit. (By "usable" here I mean that the inferred types are both compact and principal, while previous work generally had only one of those properties.)

Polymorphism, Subtyping, and Type Inference in MLsub

Stephen Dolan and Alan Mycroft

2017

We present a type system combining subtyping and ML-style parametric polymorphism. Unlike previous work, our system supports type inference and has compact principal types. We demonstrate this system in the minimal language MLsub, which types a strict superset of core ML programs.

This is made possible by keeping a strict separation between the types used to describe inputs and those used to describe outputs, and extending the classical unification algorithm to handle subtyping constraints between these input and output types. Principal types are kept compact by type simplification, which exploits deep connections between subtyping and the algebra of regular languages. An implementation is available online.

The paper is full of interesting ideas. For example, one idea is that adding type variables to the base grammar of types -- instead of defining them by their substitution -- forces us to look at our type systems in ways that are more open to extension with new features. I would recommend looking at this paper even if you are interested in ML and type inference, but not subtyping, or in polymorphism and subtyping, but not type inference, or in subtyping and type inference, but not functional languages.

This paper is also a teaser for the first author's PhD thesis, *Algebraic Subtyping*, currently not available online -- I suppose the author is not very good at the internet.

(If you are looking for interesting work on inference of polymorphism and subtyping in object-oriented languages, I would recommend Getting F-Bounded Polymorphism into Shape by Ben Greenman, Fabian Muehlboeck and Ross Tate, 2014.)

### Salon des Refusés -- Dialectics for new computer science

In the first decade of the twenty-first century, the Feyerabend Project organized several workshops to discuss and develop new ways to think of programming languages and computing in general. A new event in this direction is a workshop that will take place in Brussels, in April, co-located with the new <Programming> conference -- also worth a look.

Salon des Refusés -- Dialectics for new computer science

April 2017, Brussels

Salon des Refusés (“exhibition of rejects”) was an 1863 exhibition of artworks rejected from the official Paris Salon. The jury of Paris Salon required near-photographic realism and classified works according to a strict genre hierarchy. Paintings by many, later famous, modernists such as Édouard Manet were rejected and appeared in what became known as the Salon des Refusés. This workshop aims to be the programming language research equivalent of Salon des Refusés. We provide a venue for exploring new ideas and new ways of doing computer science.

Many interesting ideas about programming might struggle to find space in the modern programming language research community, often because they are difficult to evaluate using established evaluation methods (be it proofs, measurements or controlled user studies). As a result, new ideas are often seen as “unscientific”.

This workshop provides a venue where such interesting and thought provoking ideas can be exposed to critical evaluation. Submissions that provoke interesting discussion among the program committee members will be published together with an attributed review that presents an alternative position, develops additional context or summarizes discussion from the workshop. This means of engaging with papers not only enables explorations of novel programming ideas, but also encourages new ways of doing computer science.

The workshop's webpage also contains descriptions of some formats that could "make it possible to think about programming in a new way", including: Thought experiments, Experimentation, Paradigms, Metaphors, myths and analogies, and From jokes to science fiction.

For writings on similar questions about formalism, paradigms or method in programming language research, see Richard Gabriel's work, especially The Structure of a Programming Language Revolution (2012) and Writers’ Workshops As Scientific Methodology, Tomas Petricek's work, especially Against a Universal Definition of 'Type' (2015) and Programming language theory: Thinking the unthinkable (2016), and Jonathan Edwards' blog, Alarming Development.

For programs of events of similar inspiration in the past, you may be interested in the Future of Programming workshops: program of 2014, program of September 2015, program of October 2015. Other events that are somewhat similar in spirit -- but maybe less radical in content -- are Onward!, NOOL and OBT.

### Michael Snoyman: Haskell for Dummies

There was an image that made the rounds a while ago.

The joke being: haha, Haskell is only for super-geniuses like Einstein. There's lots to complain about in this chart, but I'm going to pick on the lower-right corner. Specifically:

**Haskellers don't use Haskell because we think we're Einstein. We use
Haskell because we know we aren't.**

When I speak to Haskellers, the general consensus is: "I'm not smart enough to write robust code in a language like Python." We're not using Haskell because we're brilliant; we're using Haskell because we know we need a language that will protect us from ourselves.

That said, I should acknowledge that Haskell does have a steeper learning curve for most programmers. But this is mostly to do with unfamiliarity: Haskell is significantly different from languages like Python, Ruby, and Java, whereas by contrast those languages are all relatively similar to each other. Great educational material helps with this.

You should set your expectations appropriately: it will take you longer to learn Haskell, but it's worth it. Personally, I use Haskell because:

- It gives me the highest degree of confidence that I'll write my program correctly, due to its strong, static typing
- It has great support for modern programming techniques, like functional programming and green-thread-based concurrency
- I can write more maintainable code in it than other languages
- It has a great set of libraries and tools
- It's got great performance characteristics for high-level code, and allows low-level performance tweaking when needed

I'm certainly leaving off a lot of points here; my goal isn't to be comprehensive. Instead, I'd like to dispel this notion of the Haskeller super-genius. We Haskellers don't believe it. We know why we're using a language like Haskell: to protect us from ourselves.

### FP Complete: Scripting in Haskell

Writing scripts in Haskell using Stack is straightforward and reliable. We've made a screencast to demonstrate this:

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="315" src="https://www.youtube.com/embed/UVek-DEc_pQ" width="560"></iframe>

**Summary**

Slides in the screencast cover:

- What Haskell is
- What Stack is
- We make a case for reproducible scripting

We cover the following example cases:

- Hello, World! as a Haskell script
- Generating a histogram of lines of a file
- Using FSNotify to watch a directory for any changes to files

In summary, we show:

- Scripting in Haskell is reproducible: it works over time without bitrot.
- It's easy to do: just add the shebang #!/usr/bin/env stack to the top of your file.
- We can use useful, practical libraries that are not available in traditional scripting languages.
- Our script is cross-platform: it'll work on OS X, Linux or Windows.
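As a concrete sketch of the shebang approach: a complete Stack script needs only the interpreter line plus a `main` (the exact stack options below mirror the usual form; your resolver may differ):

```haskell
#!/usr/bin/env stack
-- stack --install-ghc runghc
-- A complete, runnable Haskell script: make it executable with
-- chmod +x and run it directly, or run it via `stack <file>.hs`.
main :: IO ()
main = putStrLn "Hello, World!"
```

The comment line after the shebang is where Stack reads its options, so the script pins its own toolchain and dependencies.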

### FP Complete: Comparative Concurrency with Haskell

Last week, I was at DevConTLV X and attended a workshop by Aaron Cruz. While the title was a bit of a lie (it wasn't four hours, and we didn't do a chat app), it was a great way to see some basics of concurrency in different languages. Of course, that made me all the more curious to add Haskell to the mix.

I've provided multiple different implementations of this program in Haskell, focusing on different approaches (matching the approaches of the other languages, highly composable code, and raw efficiency). These examples are intended to be run and experimented with. The only requirement is that you install the Haskell build tool Stack. You can download a Windows installer, or on OS X and Linux run:

```
curl -sSL https://get.haskellstack.org/ | sh
```

We'll start with approaches very similar to other languages like Go and Rust, and then dive into techniques like Software Transactional Memory, which provides a much improved concurrency experience for more advanced workflows. Finally we'll look at the async library, which provides some very high-level functions for writing concurrent code in a robust manner.

Unfortunately I don't have access to the source code for the other languages right now, so I can't provide a link to it. If anyone has such code (or wants to write up some examples for other languages), please let me know so I can add a link.

**The problem**

We want to spawn a number of worker threads which will each sleep for a random period of time, grab an integer off of a shared work queue, square it, and put the result back on a result queue. Meanwhile, a master thread will fill up the work queue with integers, and read and print results.

**Running the examples**

Once you've installed Stack, you can save each code snippet into a file with a .hs extension (like concurrency.hs), and then run it with stack concurrency.hs. If you're on OS X or Linux, you can also:

```
chmod +x concurrency.hs
./concurrency.hs
```

The first run will take a bit longer as it downloads the GHC compiler and installs library dependencies, but subsequent runs will be able to use the cached results. You can read more about scripting with Haskell.

**Channels**

Most of the other language examples used some form of channels. We'll begin with a channel-based implementation for a convenient comparison to other language implementations. As you'll see, Haskell's channel-based concurrency is quite similar to what you'd experience in languages like Go and Elixir.

```haskell
#!/usr/bin/env stack
-- stack --install-ghc --resolver lts-6.23 runghc --package random
import Control.Concurrent (forkIO, threadDelay, readChan, writeChan, newChan)
import Control.Monad (forever)
import System.Random (randomRIO)

-- The following type signature is optional. Haskell has type
-- inference, which makes most explicit signatures unneeded. We
-- usually include them at the top level for easier reading.
workerCount, workloadCount, minDelay, maxDelay :: Int
workerCount = 250
workloadCount = 10000
minDelay = 250000 -- in microseconds, == 0.25 seconds
maxDelay = 750000 -- == 0.75 seconds

-- Launch a worker thread. We take in the request and response
-- channels to communicate on, as well as the ID of this
-- worker. forkIO launches an action in a new thread, and forever
-- repeats the given action indefinitely.
worker requestChan responseChan workerId = forkIO $ forever $ do
    -- Get a random delay value between the min and max delays
    delay <- randomRIO (minDelay, maxDelay)
    -- Delay this thread by that many microseconds
    threadDelay delay
    -- Read the next item off of the request channel
    int <- readChan requestChan
    -- Write the response to the response channel
    writeChan responseChan (workerId, int * int)

main = do
    -- Create our communication channels
    requestChan <- newChan
    responseChan <- newChan

    -- Spawn off our worker threads. mapM_ performs the given action
    -- on each value in the list, which in this case is the
    -- identifiers for each worker.
    mapM_ (worker requestChan responseChan) [1..workerCount]

    -- Define a helper function to handle each integer in the workload
    let perInteger int = do
            -- Write the current item to the request channel
            writeChan requestChan int
            -- Read the result off of the response channel
            (workerId, square) <- readChan responseChan
            -- Print out a little message
            putStrLn $ concat
                [ "Worker #"
                , show workerId
                , ": square of "
                , show int
                , " is "
                , show square
                ]

    -- Now let's loop over all of the integers in our workload
    mapM_ perInteger [1..workloadCount]
```

This is a pretty direct translation of how you would do things in a language like Go or Erlang/Elixir. We've replaced for-loops with maps, but otherwise things are pretty similar.

There's a major limitation in this implementation, unfortunately. In the master thread, our perInteger function is responsible for providing requests to the workers. However, it will only provide one work item at a time and then block for a response. This means that we are effectively limiting ourselves to one concurrent request. We'll address this in various ways in the next few examples.

**Compare-and-swap**

It turns out that in this case we can use a lighter-weight alternative to a channel for the requests. Instead, we can just put all of our requests into an IORef - which is the basic mutable variable type in Haskell - and then pop requests off of the list inside that variable. Veterans of concurrency bugs will be quick to point out the read/write race condition you'd usually expect:

- Thread A reads the list from the variable
- Thread B reads the list from the variable
- Thread A pops the first item off the list and writes the rest to the variable
- Thread B pops the first item off the list and writes the rest to the variable

In this scenario, both threads A and B will end up with the same
request to work on, which is certainly *not* our desired
behavior. However, Haskell provides built-in compare-and-swap
functionality, allowing us to guarantee that our read and write are
*atomic operations*. This only works for a subset of Haskell
functionality (specifically, the *pure* subset which does not have I/O
side effects), which fortunately our pop-an-element-from-a-list falls
into. Let's see the code.
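The heart of this approach is atomicModifyIORef, which applies a pure function to the variable's contents as one atomic operation. A minimal, runnable sketch of the pop step (the `pop` helper and its name are my own illustration, not taken from the post):

```haskell
import Data.IORef (IORef, newIORef, readIORef, atomicModifyIORef')

-- Atomically pop the head of the list stored in the IORef.
-- The function we pass is pure: it returns the new contents plus a
-- result, and the compare-and-swap machinery guarantees no other
-- thread's write can slip in between our read and our write.
pop :: IORef [Int] -> IO (Maybe Int)
pop ref = atomicModifyIORef' ref $ \xs ->
    case xs of
        []     -> ([], Nothing)
        (y:ys) -> (ys, Just y)

main :: IO ()
main = do
    ref <- newIORef [1, 2, 3]
    Just x <- pop ref
    Just y <- pop ref
    rest <- readIORef ref
    print (x, y, rest)  -- (1,2,[3])
```

Because the modification function is pure, the runtime is free to retry it on contention without any observable side effects.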

Compare-and-swap operations can be significantly more efficient than using true concurrency datatypes (like the Chans we saw above, or Software Transactional Memory). But they are also limiting, and don't compose nicely. Use them when you need a performance edge, or have some other reason to prefer an IORef.

Compared to our channels example, there are some differences in behavior:

- In the channels example, our workers looped forever, whereas here they have an explicit stop condition. In reality, the Haskell runtime will automatically kill worker threads that are blocked on a channel without any writer. However, we'll see how to use closable channels in a later example.
- The channels example would only allow one request on the request channel at a time. This is similar to some of the examples from other languages, but defeats the whole purpose of concurrency: only one worker will be occupied at any given time. This IORef approach allows multiple workers to have work items at once. (Again, we'll see how to achieve this with channels in a bit.)

You may be concerned about memory usage: won't holding that massive
list of integers in memory all at once be expensive? Not at all:
Haskell is a *lazy* language, meaning that the list will be
constructed on demand. Each time a new element is asked for, it will
be allocated, and then can be immediately garbage collected.

**Software Transactional Memory**

One of the most powerful concurrency techniques available in Haskell is Software Transactional Memory (STM). It allows us to have mutable variables, and to make modifications to them atomically. For example, consider this little snippet from a theoretical bank account application:

```haskell
transferFunds from to amt = atomically $ do
    fromOrig <- readTVar from
    toOrig <- readTVar to
    writeTVar from (fromOrig - amt)
    writeTVar to (toOrig + amt)
```

In typical concurrent style, this would be incredibly unsafe: it's entirely possible for another thread to modify the from or to bank account values between the time our thread reads and writes them. However, with STM, we are guaranteed *atomicity*. STM will keep a ledger of changes made during an atomic transaction, and then attempt to commit them all at once. If any of the variables referenced have been modified during the transaction, the ledger will be rolled back and tried again. And like atomicModifyIORef from above, Haskell disallows side-effects inside a transaction, so that this retry behavior cannot be observed from the outside world.
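To make the composability concrete, here is a runnable variation of the bank-account snippet. I've moved atomically out of transferFunds (so the function returns an STM action) to show how small transactions compose into one larger indivisible transaction; the account names and balances are made up for illustration:

```haskell
import Control.Concurrent.STM

-- Same logic as before, but returning STM () instead of wrapping
-- in atomically, so that transfers can be composed.
transferFunds :: TVar Int -> TVar Int -> Int -> STM ()
transferFunds from to amt = do
    fromOrig <- readTVar from
    toOrig <- readTVar to
    writeTVar from (fromOrig - amt)
    writeTVar to (toOrig + amt)

main :: IO ()
main = do
    alice <- newTVarIO 100  -- made-up starting balances
    bob   <- newTVarIO 50
    -- Two transfers run as one indivisible transaction: no other
    -- thread can ever observe the state between them.
    atomically $ do
        transferFunds alice bob 30
        transferFunds bob alice 10
    balances <- (,) <$> readTVarIO alice <*> readTVarIO bob
    print balances  -- (80,70)
```

This is the design choice that makes STM scale to complex programs: atomicity is applied at the call site, so callers decide the transaction boundaries.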

To stress this point: **Haskell's STM can eliminate the possibility
for race conditions and deadlocks from many common concurrency
patterns, greatly simplifying your code**. The leg-up that Haskell has
over other languages in the concurrency space is the ability to take
something that looks like calamity and make it safe.

We're going to switch our channels from above to STM channels. For the request channel, we'll use a bounded, closable channel (TBMChan). Bounding the size of the channel prevents us from loading too many values into memory at once, and using a closable channel allows us to tell our workers to exit.

```haskell
#!/usr/bin/env stack
{- stack --install-ghc --resolver lts-6.23 runghc
   --package random
   --package stm-chans
-}
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM (atomically, writeTChan, readTChan, newTChan)
import Control.Concurrent.STM.TBMChan (readTBMChan, writeTBMChan, newTBMChan, closeTBMChan)
import Control.Monad (replicateM_)
import System.Random (randomRIO)

workerCount = 250
workloadCount = 10000
minDelay = 250000 -- in microseconds, == 0.25 seconds
maxDelay = 750000 -- == 0.75 seconds

worker requestChan responseChan workerId = forkIO $ do
    let loop = do
            delay <- randomRIO (minDelay, maxDelay)
            threadDelay delay

            -- Interact with the STM channels atomically
            toContinue <- atomically $ do
                -- Get the next request, if the channel is open
                mint <- readTBMChan requestChan
                case mint of
                    -- Channel is closed, do not continue
                    Nothing -> return False
                    -- Channel is open and we have a request
                    Just int -> do
                        -- Write the response to the response channel
                        writeTChan responseChan (workerId, int, int * int)
                        -- And yes, please continue
                        return True
            if toContinue
                then loop
                else return ()

    -- Kick it off!
    loop

main = do
    -- Create our communication channels. We're going to ensure the
    -- request channel never gets more than twice the size of the
    -- number of workers to avoid high memory usage.
    requestChan <- atomically $ newTBMChan (workerCount * 2)
    responseChan <- atomically newTChan

    mapM_ (worker requestChan responseChan) [1..workerCount]

    -- Fill up the request channel in a dedicated thread
    forkIO $ do
        mapM_ (atomically . writeTBMChan requestChan) [1..workloadCount]
        atomically $ closeTBMChan requestChan

    replicateM_ workloadCount $ do
        -- Read the result off of the response channel
        (workerId, int, square) <- atomically $ readTChan responseChan
        -- Print out a little message
        putStrLn $ concat
            [ "Worker #"
            , show workerId
            , ": square of "
            , show int
            , " is "
            , show square
            ]
```

Overall, this looks pretty similar to our previous channels example, which isn't surprising given the relatively basic usage of concurrency going on here. However, STM is a good default choice in Haskell applications, due to how easy it is to build up complex concurrent programs with it.

**Address corner cases**

Alright, we've tried mirroring how examples in other languages work, given a taste of compare-and-swap, and explored the basics of STM. Now let's make our code more robust. The examples here - and those for other languages - often take some shortcuts. For example, what happens if one of the worker threads encounters an error? When our workload is simply "square a number," that's not a concern, but with more complex workloads errors are very much expected.

Our first example, as mentioned above, didn't allow for true concurrency, since it kept the channel size down to 1. And all of our examples have made one other assumption: the number of results expected. In many real-world applications, one request may result in 0, 1, or many result values. So to sum up, let's create an example with the following behavior:

- If *any* of the threads involved abort exceptionally, take down the whole computation, leaving no threads alive
- Make sure that multiple workers can work in parallel
- Let the workers exit successfully when there are no more requests available
- Keep printing results until all worker threads exit

We have one final tool in our arsenal that we haven't used yet: the async library, which provides some incredibly useful concurrency tools. Arguably, the most generally useful functions there are concurrently (which runs two actions in separate threads, as we'll describe in the comments below), and mapConcurrently, which applies concurrently over a list of values.

This example is how I'd recommend implementing this algorithm in practice: it uses solid library functions, accounts for exceptions, and is easy to extend for more complicated use cases.

```haskell
#!/usr/bin/env stack
{- stack --install-ghc --resolver lts-6.23 runghc
   --package random
   --package async
   --package stm-chans
-}
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (mapConcurrently, concurrently)
import Control.Concurrent.STM (atomically)
import Control.Concurrent.STM.TBMChan (readTBMChan, writeTBMChan, newTBMChan, closeTBMChan)
import System.Random (randomRIO)

workerCount = 250
workloadCount = 10000
minDelay = 250000 -- in microseconds, == 0.25 seconds
maxDelay = 750000 -- == 0.75 seconds

-- Not meaningfully changed from above, just some slight style
-- tweaks. Compare and contrast with the previous version if desired
-- :)
worker requestChan responseChan workerId = do
    let loop = do
            delay <- randomRIO (minDelay, maxDelay)
            threadDelay delay
            mint <- atomically $ readTBMChan requestChan
            case mint of
                Nothing -> return ()
                Just int -> do
                    atomically $ writeTBMChan responseChan (workerId, int, int * int)
                    loop
    loop

main = do
    -- Create our communication channels. Now the response channel is
    -- also bounded and closable.
    requestChan <- atomically $ newTBMChan (workerCount * 2)
    responseChan <- atomically $ newTBMChan (workerCount * 2)

    -- We're going to have three main threads. Let's define them all
    -- here. Note that we're _defining_ an action to be run, not
    -- running it yet! We'll run them below.
    let -- runWorkers is going to run all of the worker threads
        runWorkers = do
            -- mapConcurrently runs each function in a separate thread
            -- with a different argument from the list, and then waits
            -- for them all to finish. If any of them throw an
            -- exception, all of the other threads are killed, and
            -- then the exception is rethrown.
            mapConcurrently (worker requestChan responseChan) [1..workerCount]
            -- Workers are all done, so close the response channel
            atomically $ closeTBMChan responseChan

        -- Fill up the request channel, exactly the same as before
        fillRequests = do
            mapM_ (atomically . writeTBMChan requestChan) [1..workloadCount]
            atomically $ closeTBMChan requestChan

        -- Print each result
        printResults = do
            -- Grab a response if available
            mres <- atomically $ readTBMChan responseChan
            case mres of
                -- No response available, so exit
                Nothing -> return ()
                -- We got a response, so...
                Just (workerId, int, square) -> do
                    -- Print it...
                    putStrLn $ concat
                        [ "Worker #"
                        , show workerId
                        , ": square of "
                        , show int
                        , " is "
                        , show square
                        ]
                    -- And loop!
                    printResults

    -- Now that we've defined our actions, we can use concurrently to
    -- run all of them. This works just like mapConcurrently: it forks
    -- a thread for each action and waits for all threads to exit
    -- successfully. If any thread dies with an exception, the other
    -- threads are killed and the exception is rethrown.
    runWorkers `concurrently` fillRequests `concurrently` printResults
    return ()
```

By using the high-level concurrently and mapConcurrently functions, we avoid any possibility of orphaned threads, and get automatic exception handling and cancelation functionality.

**Why Haskell**

As you can see, Haskell offers many tools for advanced concurrency. At the most basic level, Chans and forkIO give you pretty similar behavior to what other languages provide. However, IORefs with compare-and-swap provide a cheap concurrency primitive not available in most other languages. And the combination of STM and the async package is a toolset that to my knowledge has no equal in other languages. The fact that side-effects are explicit in Haskell allows us to do many advanced feats that wouldn't be possible elsewhere.

We've only just barely scratched the surface of what you can do with Haskell. If you're interested in learning more, please check out our Haskell Syllabus for a recommended learning route. There's also lots of content on the haskell-lang get started page. And if you want to learn more about concurrency, check out the async tutorial.

FP Complete also provides corporate and group webinar training sessions. Please check out our training page for more information, or see our consulting page for how we can help your team succeed with devops and functional programming.

**Advanced questions**

We skirted some more advanced topics above, but for the curious, let me address some points:

In our first example, we use forever to ensure that our workers would never exit. But once they had no more work to do, what happens to them? The Haskell runtime is smart enough to notice in general when a channel has no more writers, and will automatically send an asynchronous exception to a thread which is trying to read from such a channel. This works well enough for a demo, but is not recommended practice:

- It's possible, though unlikely, that the runtime system won't be able to figure out that your thread should be killed
- It's much harder to follow the logic of a program which has no explicit exit case
- Using exceptions for control flow is generally a risky endeavor, and in the worst case can lead to very unexpected bugs

For the observant Haskeller, our definitions of runWorkers and fillRequests in the last example may look dangerous. Specifically: what happens if one of those actions throws an exception before closing the channel? The other threads reading from the channel will be blocked indefinitely! Well, three things:

- As just described, the runtime system will likely be able to kill the thread if needed
- However, because of the way we structured our program, it won't matter: if either of these actions dies, it will take down the others, so no one will end up blocked on a channel read
- Nonetheless, I strongly recommend being exception-safe in all cases (I'm kind of obsessed with it), so a better way to implement these functions would be with finally, e.g.:

```haskell
fillRequests = mapM_ (atomically . writeTBMChan requestChan) [1..workloadCount]
    `finally` atomically (closeTBMChan requestChan)
```

This post was explicitly about *concurrency*, or running multiple I/O actions at the same time. I avoided talking about the very much related topic of parallelism, which is speeding up a computation by performing work on multiple cores. In other languages, the distinction between these is minor. In Haskell, with our separation between purity and impurity, parallelism can often be achieved with something as simple as replacing map with parMap (parallel map).

That said, it's certainly possible - and common - to implement parallelism via concurrency. In order to make that work, we would need to force evaluation of the result value (int * int) before writing it to the channel. This could be achieved with something like:

```haskell
let !result = int * int
writeChan responseChan (workerId, result)
```

The ! is called a bang pattern, and indicates that evaluation should be forced immediately.
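As a tiny self-contained illustration of forcing (note that a bang pattern in a let binding requires the BangPatterns language extension):

```haskell
{-# LANGUAGE BangPatterns #-}

main :: IO ()
main = do
    let int = 7 :: Int
        -- Without the bang, result would stay an unevaluated thunk
        -- until something demands it; with it, the multiplication
        -- happens right at this binding.
        !result = int * int
    print result  -- 49
```

In the channel example above, this is what ensures the squaring work actually happens on the worker thread, rather than being deferred to whichever thread eventually reads the value.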