News aggregator
Functional Jobs: Developer / DevOps Engineer at Klarna (Fulltime)
Our FRED teams are responsible for building and operating Klarna’s highly scalable and highly available purchaseaccepting system. By combining Erlang, RIAK and RabbitMQ this business critical, multinode, nomaster system handles our exponential growth of real time transactions. The teams have endtoend responsibility and believe in relentless automation, shipped code and reusability.
We are looking for a functional programmer that wants to take on some operations responsibilities and who believes that you should be responsible for what you create. The role will primarily involve developing infrastructure for logging, monitoring, testing, deployment and operations in order to support the system and contributing feature teams, but a willingness to understand the underlying businesslogic and when necessary contribute some feature development is also important. We are looking for someone who can grow into being a Hamiltonian salty mechanic in a mission critical helicopter, so to speak.
Required
 B.Sc/M.Sc in Computer Science
 experience with a functional programming language (Erlang, a LISP, an ML, Haskell, etc)
 comfortability with shell scripting
 extremely high attention to detail
 intellectual curiosity
 a preference for simple, elegant, and reusable solutions
 a getthingsdone attitude
Nice to have
 some experience with configuration management (Chef, Puppet, ansible, cfengine, etc)
 some exposure to Java, Erlang or Ruby
 experience with graphite, collectd
 experience with virtualization (CloudStack, AWS, etc)
 nagios/op5 experience
 experience with Splunk API
 RabbitMQ experience
 Riak experience (or experience with distributed databases NoSQL/SQL)
 proven technical writing ability
Would be cool
 history of giving entertaining technical talks at conferences
If you have any question please email Philip Alsén at: philip [dot] alsen [at] klarna [dot] com. Apply as soon as possible since we are working continuously with the recruitment process.
Location Stockholm, Sweden
Get information on how to apply for this position.
Erik de Castro Lopo: Moving from Wai 2.X to 3.0.
Michael Snoyman has just released version 3.0 of Wai, the Haskell Web Application Interface library which is used with the Yesod Web Framework and anything that uses the Warp web server. The important changes for Wai are listed this blog post. The tl;dr is that removing the Conduit library dependency makes the Wai interface more easily usable with one of the alternative Haskell streaming libraries, like Pipes, StreamIO, Iterator etc.
As a result of the above changes, the type of a web application changes as follows:
 Wai > 2.0 && Wai < 3.0 type Application = Request > IO Response  Wai == 3.0 type Application = Request > (Response > IO ResponseReceived) > IO ResponseReceivedTypically a function of type Application will be run by the Warp web server using one of Warp.run or associated functions which have type signatures of:
run :: Port > Application > IO () runSettings :: Settings > Application > IO () runSettingsSocket :: Settings > Socket > Application > IO ()Source runSettingsConnection :: Settings > IO (Connection, SockAddr) > Application > IO ()Its important to note that the only thing that has changed about these Warp functions is the Application type. That means that if we have a function oldWaiApplication that we want to interface to the new version of Wai, we can just wrap it with the following function:
newWaiApplication :: Manager > Request > (Response > IO ResponseReceived) > IO ResponseReceived newWaiApplication mgr wreq receiver = oldWaiApplication mgr wreq >>= receiverand use newWaiApplication in place of oldWaiApplication in the call to whichever of the Warp run functions you are using.
Actually, installing and updating GHC on Mac OS X is really easy
For a short while today, I was trying to figure out how to install GHC on a new OS X installation. There are tons of ways to do it. Too many, and most don't feel easily maintainable.
Then, I came across GHC for MAC OS X: http://ghcformacosx.github.io
Just what I'd been looking for! But wouldn't it be nice if you could automatically update it from the command line each time someone packages a new one (like brew)?
Well, turns out you can.
 Install homebrewcask ( http://gillesfabio.github.io/homebrewcaskhomepage/#installation )
 Run "brew cask install ghc" each time you want to install or upgrade ghc to the latest app
A pull request is in the process of upgrading their cask to r6, and with some community backing (or even some help from the author of ghcformacosx) we could easily keep it current and shiny.
submitted by kvanberendonck[link] [6 comments]
nonexhaustive pattern match(es)?
Keegan McAllister: On depression, privilege, and online activism
[Content warning: depression, privilege, online activism]
This isn't a general account of my experiences with depression. Many people have written about that, and I don't have much to add. But there's one aspect that I don't hear about very often. It's something that bothers me a lot, and others have told me that it bothers them too.
The thing is, I'm not just a person with a mental illness. I'm also a welloff white guy, and I enjoy a whole set of unearned privileges from that. Every day people around the world are harassed, abused, and killed over things I never have to worry about. Even in mundane daily life, most everyone is playing on a higher difficulty setting than I ever will.
I've thought about this a lot over the past few years, and I'm trying to understand how I can help make the world more fair and less oppressive. So I give money and I volunteer a little and I speak up when it seems useful, but mostly I listen. I listen to the experiences of people who are different from me. I try to get some understanding of how they feel and why.
How is this related to depression? Because the reality of privilege and oppression is fucking depressing. Of course it's depressing to those who are directly harmed. That's a lot of what I read about, and some of the despair transfers to me. But my profiting from the suffering of others in a way that I mostly can't change is also depressing, at least if I make an attempt not to ignore it.
And my distress over my role in systems of oppression brings its own layer of guilt. People are actually suffering and I feel sorry for myself because I'm dimly aware of it? But this comes from the voice that has always taunted me about depression. “How can you be sad? Your life is great. If you had real problems you wouldn't be so pathetic. You're not really sick. You're just a whiner.”
All of which is part of the disease. I need to own it and work on it every day. But it seems like every time I read an online discussion about social justice, I take a huge step backwards.
It's hard to shrug off the “men are horrible” comments when I spend so much effort trying to convince myself that I'm not horrible. When I hear people gloating about delicious white male tears, I think about all the times when I would come home from work and collapse in bed crying. Is this what they want my life to be?
I can't give myself permission to tune out, because the same people lecture constantly about my obligation to be a good ally, which mostly takes the form of “shut up and listen.” And then when I'm upset by the things they say, the response is “This isn't for you! Why are you listening?”
A local group, one that had recently invited me to hang out as a guest, retweeted a member's declaration to wouldbe allies: “We're not friends. Fuck you.” Can you see why it feels like they're trying to hurt me?
Let me be clear: I truly don't care if people in a room somewhere are talking about how men are the worst. I don't feel oppressed by it, and I have no desire to argue with it. But I can't handle direct exposure.
And don't tell me that I'm too stupid to understand why they say these things. I know intellectually that it's not about me. I understand the need to vent and the importance of building solidarity. None of that matters on the emotional level where these comments register like a punch to the gut. I do feel this way, even if I shouldn't and I wish I didn't.
I'm talking about mental health, triggers, and unintentionally hurtful speech. Does that sound familiar? One reason I was drawn to intersectional feminism is that it seemed to have a good set of ground rules for how to treat everyone decently. But now I feel like I'm excluded from protection. “Men are horrible” is apparently the one form of speech where intent is all that matters, and I'm a bad person if it triggers something. I've been told it's offensive that I would even try to describe my experience in those terms.
It hurts a whole lot to try and really feel someone's pain, and then realize they don't even slightly give a shit about me. It hurts even more when they'll bend over backwards for anyone except me.
Look, I get it. You argue all the time with trolls who claim that men have it just as bad as women and will shout “what about the men” as a way to disrupt any discussion. When you're engaged in meme warfare, you can't show them any human empathy. They certainly wouldn't return the favor. And if my voice sounds a little like theirs, that's just too bad for me.
I know that this article will serve as ammunition for some people with views I find disgusting. That sucks, but I'm done using political strategy as a reason to stay silent. I understand tone policing as a derailing tactic, and I understand the need to call it out. But at this point it seems there's no room for a sincere request for kindness, especially coming from someone who doesn't get much benefit of the doubt. (The Geek Feminism Wiki basically says that asking for kindness is tone policing if and only if you're a man.)
I'm not trying to silence anyone here. I'm not jumping in and derailing an existing conversation. I'm writing on my own blog, on my own schedule, about my own feelings. But I'm told that even this is crossing a line.
I know that I can't dictate how others feel about our fuckedup world. Does that mean I must absolutely suppress the way I feel? Even when we agree about the substance of what's wrong? I know that if I ask someone to share their life experiences, they have a right to express anger. When does expressing anger become sustained, deliberate cruelty?
“People are being oppressed and you're asking us to care about your feelings?” Yes, I am asking you to care. Just a little bit. I don't claim that my feelings should be a top priority. I hope it wouldn't come up very often. But according to the outspoken few who set the tone, I'm never allowed to bring it up. I don't deserve to ask them to be nice.
And that's why I can no longer have anything to do with this movement. It's really that simple. I guess it says something about my state of mind that I felt the need to attach 1,700 words of preemptive defenses.
The truth is, when I'm not allowed to say or even think “not all men,” part of me hears “Yes, all men, especially you.” And if I'm ever confused about whether I'm allowed to say “not all men,” there are a dozen unprompted reminders every day. Little jokes, repeated constantly to set the climate about what will and won't be tolerated.
When you treat me like one of the trolls, I start to believe that I am one. Guys who say “I support feminism but sometimes they go too far” are usually trying to excuse sexist behavior. So what do I conclude about myself when I have the same thought?
I get that “ally” is not a label you selfapply, it's a thing you do, and the label comes from others. The problem is, if a hundred people say I'm a good ally, and one person says I'm a sexist asshole, who do you think I'm going to believe?
I'm not allowed to stand up for myself, because doing so is automatically an act of oppression. If a woman treats me like shit, and she's being “more feminist” than me, I conclude that I deserve to be treated like shit. That is the model I've learned of a good ally.
I'm not a good ally, or even a bad one. I'm collateral damage.
If the point of all this is to give me a tiny little taste of the invalidation that others experience on a regular basis, then congratulations, it worked. You've made your point. Now that you've broken me, how can I possibly help you, when it seems like I'm part of the problem just by existing? It feels like all I can do is engage in emotional selfharm to repay the debt of how I was born.
I can't just take a break “until I feel better.” My depressive symptoms will always come and go, and some thoughts will reliably bring them back. I spent years reading about how the most important thing I can do, as a winner of the birth lottery, is to be an ally to marginalized people. And now I've realized that I'm too sick and weak to do it.
Even if I give up on being an ally, I can't avoid this subject. It affects a lot of my friends, and I feel even worse when I ask them not to talk about it around me. I don't want to silence anyone. At least I've mostly stopped using Twitter.
So this is how I feel, but I'm not sure anyone else can do anything about it. Really, most of the people I've talked to have been sympathetic. Maybe I need to learn not to let bullies get to me, even when they're bullying in service of a cause I support. They don't seem to get much pushback from the wider community, at any rate.
What gives me hope is, I recognize that my participation in the endless shouting online wasn't really useful to anyone. If I can let myself ignore all that, maybe I can recover some of my energy for other activities that actually help people.
That's all I have to say right now. Thank you for listening to me.
Bryan O'Sullivan: Win bigger statistical fights with a better jackknife
(Summary: I’ve developed some algorithms for a statistical technique called the jackknife that run in O(n) time instead of O(n2).)
In statistics, an estimation technique called “the jackknife” has been widely used for over half a century. It’s a mainstay for taking a quick look at the quality of an estimator of a sample. (An estimator is a summary function over a sample, such as its mean or variance.)
Suppose we have a noisy sample. Our first stopping point might be to look at the variance of the sample, to get a sense of how much the values in the sample “spread out” around the average.
If the variance is not close to zero, then we know that the sample is somewhat noisy. But our curiosity may persist: is the variance unduly influenced by a few big spikes, or is the sample consistently noisy? The jackknife is a simple analytic tool that lets us quickly answer questions like this. There are more accurate, sophisticated approaches to this kind of problem, but they’re not nearly so easy to understand and use, so the jackknife has stayed popular since the 1950s.
The jackknife is easy to describe. We take the original sample, drop the first value out, and calculate the variance (or whatever the estimator is) over this subsample. We repeat this, dropping out only the second value, and continue. For an original sample with n elements, we end up with a collection of n jackknifed estimates of all the subsamples, each with one element left out. Once we’re done, there’s an optional last step: we compute the mean of these jackknifed estimates, which gives us the jackknifed variance.
For example, suppose we have the sample [1,3,2,1]. (I’m going to write all my examples in Haskell for brevity, but the code in this post should be easy to port to any statistical language.)
The simplest way to compute variance is as follows:
var xs = (sum (map (^2) xs)  sum xs ^ 2 / n) / n where n = fromIntegral (length xs)Using this method, the variance of [1,3,2,1] is 0.6875.
To jackknife the variance:
var [1,3,2,1] == 0.6875  leave out each element in succession  (I'm using ".." to denote repeating expansions) var [ 3,2,1] == 0.6666.. var [1, 2,1] == 0.2222.. var [1,3, 1] == 0.8888.. var [1,3,2 ] == 0.6666..  compute the mean of the estimates over the subsamples mean [0.6666,0.2222,0.8888,0.6666] == 0.6111..Since 0.6111 is quite different than 0.6875, we can see that the variance of this sample is affected rather a lot by bias.
While the jackknife is simple, it’s also slow. We can easily see that the approach outlined above takes O(n2) time, which means that we can’t jackknife samples above a modest size in a reasonable amount of time.
This approach to the jackknife is the one everybody actually uses. Nevertheless, it’s possible to improve the time complexity of the jackknife for some important estimators from O(n2) to O(n). Here’s how.
Jackknifing the meanLet’s start with the simple case of the mean. Here’s the obvious way to measure the mean of a sample.
mean xs = sum xs / n where n = fromIntegral (length xs)And here are the computations we need to perform during the naive approach to jackknifing the mean.
 n = fromIntegral (length xs  1) sum [ 3,2,1] / n sum [1, 2,1] / n sum [1,3, 1] / n sum [1,3,2 ] / nLet’s decompose the sum operations into two triangles as follows, and see what jumps out:
sum [ 3,2,1] = sum [] + sum [3,2,1] sum [1, 2,1] = sum [1] + sum [2,1] sum [1,3, 1] = sum [1,3] + sum [1] sum [1,3,2 ] = sum [1,3,2] + sum []From this perspective, we’re doing a lot of redundant work. For example, to calculate sum [1,3,2], it would be very helpful if we could reuse the work we did in the previous calculation to calculate sum [1,3].
Prefix sumsWe can achieve our desired reuse of earlier work if we store each intermediate sum in a separate list. This technique is called prefix summation, or (if you’re a Haskeller) scanning.
Here’s the bottom left triangle of sums we want to calculate.
sum [] { + sum [3,2,1] } sum [1] { + sum [2,1] } sum [1,3] { + sum [1] } sum [1,3,2] { + sum [] }We can prefixsum these using Haskell’s standard scanl function.
>>> init (scanl (+) 0 [1,3,2,1]) [0,1,4,6] { e.g. [0, 0 + 1, 0 + 1 + 3, 0 + 1 + 3 + 2] }(We use init to drop out the final term, which we don’t want.)
And here’s the top right of the triangle.
{ sum [] + } sum [3,2,1] { sum [1] + } sum [2,1] { sum [1,3] + } sum [1] { sum [1,3,2] + } sum []To prefixsum these, we can use scanr, which scans “from the right”.
>>> tail (scanr (+) 0 [1,3,2,1]) [6,3,1,0] { e.g. [3 + 2 + 1 + 0, 2 + 1 + 0, 1 + 0, 0] }(As in the previous case, we use tail to drop out the first term, which we don’t want.)
Now we have two lists:
[0,1,4,6] [6,3,1,0]Next, we sum the lists pairwise, which gives get exactly the sums we need:
sum [ 3,2,1] == 0 + 6 == 6 sum [1, 2,1] == 1 + 3 == 4 sum [1,3, 1] == 4 + 1 == 5 sum [1,3,2 ] == 6 + 0 == 6Divide each sum by n1, and we have the four subsample means we were hoping for—but in linear time, not quadratic time!
Here’s the complete method for jackknifing the mean in O(n) time.
jackknifeMean :: Fractional a => [a] > [a] jackknifeMean xs = map (/ n) $ zipWith (+) (init (scanl (+) 0 xs)) (tail (scanr (+) 0 xs)) where n = fromIntegral (length xs  1)If we’re jackknifing the mean, there’s no point in taking the extra step of computing the mean of the jackknifed subsamples to estimate the bias. Since the mean is an unbiased estimator, the mean of the jackknifed means should be the same as the sample mean, so the bias will always be zero.
However, the jackknifed subsamples do serve a useful purpose: each one tells us how much its corresponding leftout data point affects the sample mean. Let’s see what this means.
>>> mean [1,3,2,1] 1.75The sample mean is 1.75, and let’s see which subsample mean is farthest from this value:
>>> jackknifeMean [1,3,2,1] [2, 1.3333, 1.6666, 2]So if we left out 1 from the sample, the mean would be 2, but if we left out 3, the mean would become 1.3333. Clearly, this is the subsample mean that is farthest from the sample mean, so 3 is the most significant outlier in our estimate of the mean.
Prefix sums and varianceLet’s look again at the naive formula for calculating variance:
var xs = (sum (map (^2) xs)  sum xs ^ 2 / n) / n where n = fromIntegral (length xs)Since this approach is based on sums, it looks like maybe we can use the same prefix summation technique to compute the variance in O(n) time.
Because we’re computing a sum of squares and an ordinary sum, we need to perform two sets of prefix sum computations:
Two to compute the sum of squares, one from the left and another from the right
And two more for computing the square of sums
If we look closely, buried in the local function var above, we will see almost exactly the naive formulation for variance, only constructed from the relevant pieces of our four prefix sums.
Skewness, kurtosis, and moreExactly the same prefix sum approach applies to jackknifing higher order moment statistics, such as skewness (lopsidedness of the distribution curve) and kurtosis (shape of the tails of the distribution).
Numerical accuracy of the jackknifed meanWhen we’re dealing with a lot of floating point numbers, the ever present concerns about numerical stability and accuracy arise.
For example, suppose we compute the sum of ten million pseudoqrandom floating point numbers between zero and one.
The most accurate way to sum numbers is by first converting them to Rational, summing, then converting back to Double. We’ll call this the “true sum”. The standard Haskell sum function (“basic sum” below) simply adds numbers as it goes. It manages 14 decimal digits of accuracy before losing precision.
true sum: 5000754.656937315 basic sum: 5000754.65693705 ^However, Kahan’s algorithm does even better.
true sum: 5000754.656937315 kahan sum: 5000754.656937315If you haven’t come across Kahan’s algorithm before, it looks like this.
kahanStep (sum, c) x = (sum', c') where y = x  c sum' = sum + y c' = (sum'  sum)  yThe c term maintains a running correction of the errors introduced by each addition.
Naive summation seems to do just fine, right? Well, watch what happens if we simply add 1010 to each number, sum these, then subtract 1017 at the end.
true sum: 4999628.983274754 basic sum: 450000.0 kahan sum: 4999632.0 ^The naive approach goes completely off the rails, and produces a result that is off by an order of magnitude!
This catastrophic accumulation of error is often cited as the reason why the naive formula for the mean can’t be trusted.
mean xs = sum xs / n where n = fromIntegral (length xs)Thanks to Don Knuth, what is usually suggested as a replacement is Welford’s algorithm.
import Data.List (foldl') data WelfordMean a = M !a !Int deriving (Show) welfordMean = end . foldl' step zero where end (M m _) = m step (M m n) x = M m' n' where m' = m + (x  m) / fromIntegral n' n' = n + 1 zero = M 0 0Here’s what we get if we compare the three approaches:
true mean: 0.49996289832747537 naive mean: 0.04500007629394531 welford mean: 0.4998035430908203Not surprisingly, the naive mean is worse than useless, but the longrespected Welford method only gives us three decimal digits of precision. That’s not so hot.
More accurate is the Kahan mean, which is simply the sum calculated using Kahan’s algorithm, then divided by the length:
true mean: 0.49996289832747537 kahan mean: 0.4999632 welford mean: 0.4998035430908203This at least gets us to five decimal digits of precision.
So is the Kahan mean the answer? Well, Kahan summation has its own problems. Let’s try out a test vector.
 originally due to Tim Peters >>> let vec = concat (replicate 1000 [1,1e100,1,1e100])  accurate sum >>> sum (map toRational vec) 2000  naive sum >>> sum vec 0.0  Kahan sum >>> foldl kahanStep (S 0 0) vec S 0.0 0.0Ugh, the Kahan algorithm doesn’t do any better than naive addition. Fortunately, there’s an even better summation algorithm available, called the KahanBabuškaNeumaier algorithm.
kbnSum = uncurry (+) . foldl' step (0,0) where step (sum, c) x = (t, c') where c'  abs sum >= abs x = c + ((sum  t) + x)  otherwise = c + ((x  t) + sum) t = sum + xIf we try this on the same test vector, we taste sweet success! Thank goodness!
>>> kbnSum vec 2000.0Not only is KahanBabuškaNeumaier (let’s call it “KBN”) more accurate than Welford summation, it has the advantage of being directly usable in our desired prefix sum form. We’ll accumulate floating point error proportional to O(1) instead of the O(n) that naive summation gives.
Poor old Welford’s formula for the mean just can’t get a break! Not only is it less accurate than KBN, but since it’s a recurrence relation with a divisor that keeps changing, we simply can’t monkeywrench it into suitability for the same prefixsum purpose.
Numerical accuracy of the jackknifed varianceIn our jackknifed variance, we used almost exactly the same calculation as the naive variance, merely adjusted to prefix sums. Here's the plain old naive variance function once again.
var xs = (sum (map (^2) xs)  sum xs ^ 2 / n) / n where n = fromIntegral (length xs)The problem with this algorithm arises as the size of the input grows. These two terms are likely to converge for large n:
sum (map (^2) xs) sum xs ^ 2 / nWhen we subtract them, floating point cancellation leads to a large error term that turns our result into nonsense.
The usual way to deal with this is to switch to a twopass algorithm. (In case it’s not clear at first glance, the first pass below calculates mean.)
var2 xs = (sum (map (^2) ys)  sum ys ^ 2 / n) / n where n = fromIntegral (length xs) ys = map (subtract (mean xs)) xsBy subtracting the mean from every term, we keep the numbers smaller, so the two sum terms are less likely to converge.
This approach poses yet another conundrum: we want to jackknife the variance. If we have to correct for the mean to avoid cancellation errors, do we need to calculate each subsample mean? Well, no. We can get away with a cheat: instead of subtracting the subsample mean, we subtract the sample mean, on the assumption that it’s “close enough” to each of the subsample means to be a good enough substitute.
So. To calculate the jackknifed variance, we use KBN summation to avoid a big cumulative error penalty during addition, subtract the sample mean to avoid cancellation error when subtracting the sum terms, and then we’ve finally got a pretty reliable floating point algorithm.
Where can you use this?The jackknife function in the Haskell statistics library uses all of these techniques where applicable, and the Sum module of the mathfunctions library provides reliable summation (including secondorder KahanBabuška summation, if you gotta catch all those least significant bits).
(If you’re not already bored to death of summation algorithms, take a look into pairwise summation. It’s less accurate than KBN summation, but claims to be quite a bit faster—claims I found to be only barely true in my benchmarks, and not worth the loss of precision.)
Timeplotters: two data visualization tools for analysis of logs and time series
Roman Cheplyaka: The problem with mtl
This article starts a series in which I am going to publish my thoughts on and experience with different «extensible effects» approaches. This one in particular will explain the problem with the classic mtl approach that motivates us to explore extensible effects in the first place.
How transformer stacks are bornOften we start with a single monad — perhaps Reader or State. Then we realize it would be nice to add more to it — other ReaderTs or StateTs, probably an EitherT etc.
At that point writing the whole stack in a type signature becomes rather onerous. So we create a type alias for it, or even a newtype, to improve type error messages. At first it looks like a good idea — we have «the monad» for our application. It removes a lot of the cognitive overhead — all our internal APIs are structured around this monad. The more time we spend working on our application, the more useful functions we invent that are automatically compatible and composable; the more joy it becomes to write code.
At least this is how I used to structure my code. I learned this approach from xmonad, the first «serious» Haskell project I studied and took part in. It has the X monad, and all the functions work in and/or with this monad.
Concrete stacks are too rigidThis approach breaks, however, once we want to have multiple applications based on the same code. At work, for instance, I’d like to reuse a significant part of code between the real application, the simulator (kind of a REPL for our messaging campaigns) and tests. But those necessarily use different monad stacks! The simulator doesn’t deal with MySQL and RabbitMQ connections; the server doesn’t need to be able to travel back and forth in time, like our simulator does; and tests for a piece of functionality should ideally use the smallest stack that’s necessary for that functionality.
So we should abstract in some way from the monad stack.
mtl’s classesOne such abstraction comes directly from mtl, the monad transformers library.
If we simply write
{# LANGUAGE NoMonomorphismRestriction #} import Control.Monad.State import Control.Monad.Reader foo = do x < ask put $ fromEnum $ not xwithout supplying any type signature, then the inferred type will be
foo :: (MonadReader Bool m, MonadState Int m) => m ()This type signature essentially says that foo is a monadic computation which has two effects: reading a boolean value and reading/writing an integral value. These effects are handled by the familiar «handlers» runState and runReader.
We can combine any such computations together, and the type system will automaticaly figure out the total set of effects, in the form of class constraints. E.g. if we also have
bar :: (MonadState Int m, MonadWriter All m) => m ()then
(do foo; bar) :: (MonadReader Bool m, MonadState Int m, MonadWriter All m) => m ()So it looks like mtl can provide us with everything that the «extensible effects» approach promises. Or does it?
The limitationUnfortunately, if we write something a little bit different, namely
{# LANGUAGE NoMonomorphismRestriction #} import Control.Monad.State import Control.Monad.Reader foo = do x < get put $ fromEnum $ not xwhere we’ve changed ask to get, the compiler gets confused:
test.hs:6:3: No instance for (Monad m) arising from a do statement Possible fix: add (Monad m) to the context of the inferred type of foo :: m () In a stmt of a 'do' block: x < get In the expression: do { x < get; put $ fromEnum $ not x } In an equation for ‘foo’: foo = do { x < get; put $ fromEnum $ not x } test.hs:6:8: No instance for (MonadState Bool m) arising from a use of ‘get’ In a stmt of a 'do' block: x < get In the expression: do { x < get; put $ fromEnum $ not x } In an equation for ‘foo’: foo = do { x < get; put $ fromEnum $ not x } test.hs:7:3: No instance for (MonadState Int m) arising from a use of ‘put’ In the expression: put In a stmt of a 'do' block: put $ fromEnum $ not x In the expression: do { x < get; put $ fromEnum $ not x }This is because mtl asserts, via a mechanism called functional dependency, that a monadic stack can have only once instance of MonadState. Because get and put in the above example operate with different types of state, that code is invalid.
Merging transformer layersSince we can’t have multiple different MonadState constraints for our reusable monadic computation, we need to merge all StateT layers in order to be able to access them through the MonadState class:
data MyState = MyState { _sInt :: Int , _sBool :: Bool }Then we could generate lenses and put them in a class to achieve modularity:
class Has f t where hasLens :: Lens t f foo :: (MonadState s m, Has Int s, Has Bool s) => m ()The drawbacks of this approach are:
 It is boilerplateheavy, requiring an instance per field and a record per stack. When you need to convert between these records, it can be quite annoying.
 Since monad transformers don’t commute in general, you can’t always merge two StateT layers together. For instance, there’s no way to achieve the semantics of StateT s1 (MaybeT (StateT s2 Identity)) using only one layer of StateT.
mtl’s classes almost provide a valid «extensible effects» implementation, if not for the functional dependency that lets us have only single MonadState instance per stack.
In the subsequent articles we’ll explore ways to address this limitation.
Mike Izbicki: A neat trick for partially closed type families
This post covers a pretty neat trick with closed type families. Normal type families are “open” because any file can add new instances of the type. Closed type families, however, must be defined in a single file. This lets the type checker make more assumptions, and so the closed families are more powerful. In this post, we will circumvent this restriction and define certain closed type families over many files.
We only need these two language extensions for the technique:
> {# LANGUAGE TypeFamilies #} > {# LANGUAGE UndecidableInstances #}But for our motivating example, we’ll also use these extensions and some basic imports:
> {# LANGUAGE KindSignatures #} > {# LANGUAGE MultiParamTypeClasses #} > {# LANGUAGE ConstraintKinds #} > import Data.Proxy > import GHC.ExtsLet’s begin.
Consider the classes:
> class Param_a (p :: * > Constraint) t > class Param_b (p :: * > Constraint) t > class Param_c (p :: * > Constraint) t > class Base tThese classes can be chained together like so:
> type Telescope_abc = Param_a (Param_b (Param_c Base))It is easy to write a type family that returns the “head” of this list. On a telescope, the lens closest to you is called the eye piece, so that’s what we’ll call our type family:
> type family EyePiece ( p :: * > Constraint ) :: * > Constraint > type instance EyePiece (Param_a p) = Param_a Base > type instance EyePiece (Param_b p) = Param_b Base > type instance EyePiece (Param_c p) = Param_c BaseAgain, this type family is “open” because new instances can be defined in any file.
We might use this EyePiece type family as:
ghci> :t Proxy :: Proxy (EyePiece Telescope_abc) :: Proxy (Param_a Base)Now, let’s try to write a type class that does the opposite. Instead of extracting the first element in the chain, it will extract the last. On a telescope the lens farthest away from you is called the objective, so that’s what we’ll call our type family. We’ll also need to define it as a closed type family:
type family Objective (lens :: * > Constraint) :: * > Constraint where Objective (Param_a p) = Objective p Objective (Param_b p) = Objective p Objective (Param_c p) = Objective p Objective (Param_a Base) = Param_a Base Objective (Param_b Base) = Param_b Base Objective (Param_c Base) = Param_c BaseWe can use the Objective family like:
ghci> :t Proxy :: Proxy (Objective Telescope_abc) :: Proxy (Param_c Base)The Objective family must be closed. This is because the only way to identify when we are at the end of the telescope is by checking if the p parmaeter is the Base class. If it is, then we’re done. If not, we must keep moving down the telescope recusively. Without a closed type family, we would have to explicitly list all of the recursive paths. This means type instances whenever we want to add a new Param_xxx class. That’s nasty and error prone.
Again, the downside of closed type families is that they must be defined all in one place. We can work around this limitation by “factoring” the closed type family into a collection of closed and open type families. In the example above, this works out to be:
> type family Objective (lens :: * > Constraint) :: * > Constraint > type instance Objective (Param_a p) = Objective_Param_a (Param_a p) > type instance Objective (Param_b p) = Objective_Param_b (Param_b p) > type instance Objective (Param_c p) = Objective_Param_c (Param_c p) > type instance Objective Base = Base > type family Objective_Param_a (lens :: * > Constraint) :: * > Constraint where > Objective_Param_a (Param_a Base) = Param_a Base > Objective_Param_a (Param_a p) = Objective p > type family Objective_Param_b (lens :: * > Constraint) :: * > Constraint where > Objective_Param_b (Param_b Base) = Param_b Base > Objective_Param_b (Param_b p) = Objective p > type family Objective_Param_c (lens :: * > Constraint) :: * > Constraint where > Objective_Param_c (Param_c Base) = Param_c Base > Objective_Param_c (Param_c p) = Objective p ghci> :t Proxy :: Proxy (Objective Telescope_abc) :: Proxy (Param_c Base)With this factoring, we are able to define the Objective instance for each Param_xxx in separate files and retain the benefits of closed type families.
Here is another example. The RemoveObjective family acts like the init function from the Prelude:
> type family RemoveObjective (lens :: * > Constraint) :: * > Constraint > type instance RemoveObjective (Param_a p) = RemoveObjective_Param_a (Param_a p) > type instance RemoveObjective (Param_b p) = RemoveObjective_Param_b (Param_b p) > type instance RemoveObjective (Param_c p) = RemoveObjective_Param_c (Param_c p) > type family RemoveObjective_Param_a (lens :: * > Constraint) :: * > Constraint where > RemoveObjective_Param_a (Param_a Base) = Base > RemoveObjective_Param_a (Param_a p) = Param_a (RemoveObjective p) > type family RemoveObjective_Param_b (lens :: * > Constraint) :: * > Constraint where > RemoveObjective_Param_b (Param_b Base) = Base > RemoveObjective_Param_b (Param_b p) = Param_b (RemoveObjective p) > type family RemoveObjective_Param_c (lens :: * > Constraint) :: * > Constraint where > RemoveObjective_Param_c (Param_c Base) = Base > RemoveObjective_Param_c (Param_c p) = Param_b (RemoveObjective p) ghci> :t Proxy :: Proxy (RemoveObjective Telescope_abc) :: Proxy (Param_a (Param_b Base))Of course, you can’t do this trick with every closed type family. For example, the RemoveObjective_Param_c family above cannot be factored any smaller. But if you find yourself wanting the benefits of both closed and open type families, then your type probably has the needed structure.
JOB: AHRC Doctoral Studentship in ComputationalMusicology
Brent Yorgey: AC and equivalence of categories
This is part three in a series of posts on avoiding the axiom of choice (part one, part two).
In my previous post, I explained one place where the axiom of choice often shows up in category theory, namely, when defining certain functors whose action on objects is specified only up to unique isomorphism. In this post, I’ll explain another place AC shows up, when talking about equivalence of categories. (Actually, as we’ll see, it’s really the same underlying issue, of defining a functor defined only up to unique isomorphism; this is just a particularly important instantiation of that issue.)
When are two categories “the same”? In traditional category theory, founded on set theory, there are quite a few different definitions of “sameness” for categories. Ultimately, this comes down to the fact that set theory does not make a very good foundation for category theory! There are lots of different ideas of equivalence, and they often do not correspond to the underlying equality on sets, so one must carefully pick and choose which notions of equality to use in which situations (and some choices might be better than others!). Every concept, it seems, comes with “strict” and “weak” variants, and often many others besides. Maintaining the principle of equivalence requires hard work and vigilence.
As an example, consider the following definition, our first candidate for the definition of “sameness” of categories:
Two categories and are isomorphic if there are functors and such that and .
Seems pretty straightforward, right? Well, this is the right idea in general, but it is subtly flawed. In fact, it is somewhat “evil”, in that it talks about equality of functors ( and must be equal to the identity). However, two functors and can be isomorphic without being equal, if there is a natural isomorphism between them—that is, a pair of natural transformations and such that and are both equal to the identity natural transformation.1 For example, consider the Haskell functors given by
data Rose a = Node a [Rose a] data Fork a = Leaf a  Fork (Fork a) (Fork a)These are obviously not equal, but they are isomorphic, in the sense that there are natural transformations (i.e. polymorphic functions) rose2fork :: forall a. Rose a > Fork a and fork2rose :: forall a. Fork a > Rose a such that rose2fork . fork2rose === id and fork2rose . rose2fork === id (showing this is left as an exercise for the interested reader).
Here, then, is a better definition:
Categories and are equivalent if there are functors and which are inverse up to natural isomorphism, that is, there are natural isomorphisms and .
So the compositions of the functors and do not literally have to be the identity functor, but only (naturally) isomorphic to it. This does turn out to be a wellbehaved notion of sameness for categories (although you’ll have to take my word for it).
The story doesn’t end here, however. In set theory, a function is a bijection—that is, an isomorphism of sets—if and only if it is both injective and surjective. By analogy, one might wonder what properties a functor must have in order to be one half of an equivalence. This leads to the following definition:
is protoequivalent2 to if there is a functor which is full and faithful (i.e., a bijection on each homset) as well as essentially surjective, that is, for every object there exists some object such that .
Intuitively, this says that “embeds” an entire copy of into (that’s the “full and faithful” part), and that every object of which is not directly in the image of is isomorphic to one that is. So every object of is “included” in the image of , at least up to isomorphism (which, remember, is supposed to be all that matters).
So, are equivalence and protoequivalence the same thing? In one direction, it is not too hard to show that every equivalence is a protoequivalence: if and are inverseuptonaturalisomorphism, then they must be fully faithful and essentially surjective. It would be nice if the converse were also true: in that case, in order to prove two categories equivalent, it would suffice to construct a single functor from one to the other, and show that has the requisite properties. This often ends up being more convenient than explicitly constructing two functors and showing they are inverse. However, it turns out that the converse is provable only if one accepts the axiom of choice!
To get an intuitive sense for why this is, suppose is fully faithful and essentially surjective. To construct an equivalence between and , we must define a functor and show it is inverse to (up to natural isomorphism). However, to define we must give its action on each object , that is, we must exhibit a function . We know that for each there exists some object such that . That is,
is a collection of nonempty sets. However, in a nonconstructive logic, knowing these sets are nonempty does not actually give us any objects! Instead, we have to use the axiom of choice, which gives us a choice function , and we can use this function as the object mapping of the functor .
So AC is required to prove that every protoequivalence is an equivalence. In fact, the association goes deeper yet: it turns out that the statement “every protoequivalence is an equivalence” (let’s call this the Axiom of Protoequivalence, or AP for short) not only requires AC, but is equivalent to it—that is, you can also derive AC given AP as an axiom!
On purely intuitive grounds, however, I would wager that to (almost?) anyone with sufficient category theory experience, it “feels” like AP “ought to be” true. If there is a full, faithful, and essentially surjective functor , then and “ought to be” equivalent. The particular choice of functor “doesn’t matter”, since it makes no difference up to isomorphism. On the other hand, we certainly don’t want to accept the axiom of choice. This puts us in the very awkward and inconsistent position of having two logically equivalent statements which we want to respectively affirm and reject. A fine pickle indeed! What to do?
There are four options (that I know of, at least):
 If one is feeling particularly rational, one can simply say, “Well, since AC and AP are equivalent, and I reject AC, I must therefore reject AP as well; my feelings about it are irrelevant.”
This is a perfectly sensible and workable approach. It’s important to highlight, therefore, that the “problem” is in some sense more a philosophical problem than a technical one. One can perfectly well adopt the above solution and continue to do category theory; it just may not be the “nicest” (a philosophical rather than technical notion!) way to do it.
We can therefore also consider some more creative solutions!

In a classical setting, one can avoid AC and affirm (an analogue of) AP by generalizing the notion of functor to that of anafunctor (Makkai 1996). Essentially, an anafunctor is a functor “defined only up to unique isomorphism”. It turns out that the appropriate analogue of AP, where “functor” has been replaced by “anafunctor”, is indeed true—and neither requires nor implies AC. Anafunctors “act like” functors in a sufficiently strong sense that one can simply do category theory using anafunctors in place of functors. However, one also has to replace natural transformations with “ananatural transformations”, etc., and it quickly gets rather fiddly.

In a constructive setting, a witness of essential surjectivity is necessarily a function which gives an actual witness , along with a proof that , for each . In other words, a constructive witness of essential surjectivity is already a “choice function”, and an inverse functor can be defined directly, with no need to invoke AC and no need for anafunctors. So in constructive logic, AP is simply true. However, this version of “essential surjectivity” is rather strong, in that it forces you to make choices you might prefer not to make: for each there might be many isomorphic to choose from, with no “canonical” choice, and it is annoying (again, a philosophical rather than technical consideration!) to be forced to choose one.

Instead of generalizing functors, a more direct solution is to generalize the notion of equality. After all, what really seems to be at the heart of all these problems is differing notions of equality (i.e. equality of sets vs isomorphism vs equivalence…). This is precisely what is done in homotopy type theory (Univalent Foundations Program 2013).3 It turns out that if one builds up suitable notions of category theory on top of HoTT instead of set theory, then (a) AP is true, (b) without the need for AC, (c) even with a weaker version of essential surjectivity that corresponds more closely to essential surjectivity in classical logic.4 This is explained in Chapter 9 of the HoTT book.
I plan to continue writing about these things in upcoming posts, particularly items (2) and (4) above. (If you haven’t caught on by now, I’m essentially blogging parts of my dissertation; we’ll see how far I get before graduating!) In the meantime, feedback and discussion are very welcome!
ReferencesMakkai, Michael. 1996. “Avoiding the Axiom of Choice in General Category Theory.” Journal of Pure and Applied Algebra 108 (2). Elsevier: 109–73.
Univalent Foundations Program, The. 2013. Homotopy Type Theory: Univalent Foundations of Mathematics. Institute for Advanced Study: http://homotopytypetheory.org/book.

The astute reader may well ask: but how do we know this is a nonevil definition of isomorphism between functors? Is it turtles all the way down (up)? This is a subtle point, but it turns out that it is not evil to talk about equality of natural transformations, since for the usual notion of category there is no higher structure after natural transformations, i.e. no nontrivial morphisms (and hence no nontrivial isomorphisms) between natural transformations. (However, you can have turtles all the way up if you really want.)↩

I made this term up, since there is no term in standard use: of course, if you accept AC, there is no need for a separate term at all!↩

As a historical note, it seems that the original work on anafunctors is part of the same intellectual thread that led to the development of HoTT.↩

That is, using propositional truncation to encode the classical notion of “there exists”.↩
System.Directory.removeDirectoryRecursive and symlinks
Robert Harper: Bellman Confirms A Suspicion
As is by now wellknown, I regard the supposed opposition between static and dynamic languages as a fallacy: the latter, being a special case of the former, can scarcely be an alternative to it. I cannot tell you how many times I’ve been handed arguments along the lines of “oh, static languages are just fine, but I want something more dynamic,” the speaker not quite realizing the absurdity of what they are saying. Yet somehow this sort of argument has an appeal, and I’ve often wondered why. I think it’s mostly just semantic innocence, but I’ve long had the suspicion that part of it is that it sounds good to be dynamic (active, outgoing, nimble) rather than static (passive, boring, staid). As we all know, much of the popularity of programming languages comes down to such superficialities and misunderstandings, so what else is new?
Well, nothing, really, except that I recently learned (from Guy Blelloch) the origin of the notably inapt term dynamic programming for a highly useful method of memoization invented by Richard Bellman that is consonant with my suspicion. Bellman, it turns out, had much the same thought as mine about the appeal of the word “dynamic”, and used it consciously to further his own ends:
“I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes.
“An interesting question is, ‘Where did the name, dynamic programming, come from?’ The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word, research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term, research, in his presence. You can imagine how he felt, then, about the term, mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various rea sons. I decided therefore to use the word, ‘programming.’ I wanted to get across the idea that this was dynamic, this was multistage, this was timevarying—I thought, let’s kill two birds with one stone. Let’s take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it’s impossible to use the word, dynamic, in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities” (p. 159).
Hilarious, or what? It explains a lot, I must say, and confirms a longstanding suspicion of mine about the persistent belief in a nonexistent opposition.
Update: does anyone know why we say “memoization” rather than “memorization”?
Filed under: Research, Teaching Tagged: dynamic and static typing
Knearest neighbors in Haskell is elegant... but slow
I was intrigued by the concise implementation of Knearest neighbors in Rust (and separately OCaml, F# and D), located here and posted on Hacker News. I translated it into Haskell. The translation was fairly quick work, the resulting code elegant and about 40% smaller, and as expected, as soon as it compiled, I got the expected answer (~94.4%). However, the performance was abysmal. With O2 optimizations enabled, the compiled binary took about 2 minutes of wall clock time to run, compared to ~34 seconds for Rust and ~12 for OCaml.
I'm not an expert at performant Haskell, but I took what I understand to be the basic steps: I used strictness annotations in performancecritical parts; I used ByteStrings instead of strings; I didn't have any type class functions. The only thing that jumps out at me as a possible performance drain is using lists, but there's no random access in the code, so I don't think that would make much of a difference. Another potential source of drain is the minimumBy function, which may or may not be efficient. I'm not sure how much worse it would be in idiomatic Haskell, using Strings and no strictness.
My implementation is here. I welcome comments on how to do it better. It's certainly disappointing.
Edit: Using arrays provides a HUGE improvement (thanks skew!). This code runs in about 12 seconds on my laptop, a 10X speedup.
Edit 2: Parallelization, LLVM, and other improvements have brought the time down to 2 seconds! A 60X speedup. http://lpaste.net/105456
submitted by thinkpad20[link] [20 comments]