# News aggregator

### Accumulating and threading output of functions

### announce: arithmoi 0.4.2.0

### New version of itself

### Month in Haskell Mode May 2016

### Feedback on -Wredundant-constraints

### Jasper Van der Jeugt: Cryptographic Hashes and Cyclic Dependencies

I recently wrote a blog entry on the Fugue blog, about the problems that arise when you want to compute cryptographic hashes of resources that can have dependency cycles. You can read it here. Thanks to Jared Tobin, Maciej Woś and Racquel Yerbury for proofreading and editing.

### 10 open roles in expanding team at Standard Chartered

### PPPJ 2016, Submission Deadline Extended to June 13 AoE

### Don Stewart (dons): Multiple Haskell developer roles in Strats at Standard Chartered

The Strats team at Standard Chartered has another **ten** **open positions** for typed functional programming developers, based in London or Singapore. Strats are a specialized software engineering and quantitative analysis team who build a broad range of software for financial markets users at Standard Chartered.

You will work on the trading floor, directly with traders, sales and risk managers, building software to automate their work and improve their efficiency. The focus of this role will be to build new tools for bankwide analytics and server-side derivatives pricing infrastructure.

You will use Haskell for almost all tasks: data analysis, market data publishing, database access, web services, desktop GUIs, large parallel tasks, quantitative models, solvers, everything. This is a fast paced role – code you write today will be deployed within hours to hundreds of users and has to work.

These are permanent, associate director and director positions, in London and Singapore as part of the Strats global team. Demonstrated experience in typed FP (Haskell, OCaml, F# etc) is required. We have around 3 million lines of Haskell, and our own Haskell compiler. In this context we look for skill and taste in typed functional programming to capture and abstract over complex, messy systems. You would join a growing team of around 16 experienced Haskell developers that is expanding to 25+ due to increased business need for Haskell developers.

Experience writing typed APIs to external systems such as databases, web services, pub/sub platforms is very desirable. We like working code, so if you have Hackage or github libraries, we definitely want to see them. We also like StackOverflow answers, blog posts, academic papers, or other arenas where you can show broad FP ability. A PhD or Masters Degree in Computer Science is an advantage but not a requirement.

The role requires physical presence on the trading floor in Singapore or London. Remote work is not an option. You will have some project and client management skills — you will talk to users, understand their problems and then implement and deliver what they really need. No financial background is required. These positions have attractive remuneration for the right candidates. Relocation support will also be provided. Contracting-based positions are also possible if desired.

Applicants who don’t necessarily meet all criteria but have an interest in working in Singapore in particular, and have an FP background, are encouraged to apply.

More info about our development process is in the 2012 PADL keynote, and a 2013 HaskellCast interview.

If this sounds exciting to you, please send your resume to me – donald.stewart <at> sc.com

### International summer school on metaprogramming(Cambridge, 8-12 Aug 2016)

### how to reify an API?

### Summer School on Bidirectional Transformations, Oxford,25-29th July 2016

### Roman Cheplyaka: Surprising reciprocity

I have two correlated random variables, \(X\) and \(Y\), with zero mean and equal variance. I tell you that the best way to predict \(Y\) based on the knowledge of \(X\) is \(y = a x\). Now, you tell me, what is the best way to predict \(X\) based on \(Y\)?

Your intuition might tell you that if \(y = ax\), then \(x = y/a\). This is correct most of the time… but not here. The right answer will surprise you.

So *what is* the best way to predict \(Y\) based on \(X\) and vice versa? Let’s find the \(a\) that minimizes the mean squared error \(E[(Y-aX)^2]\):

\[E[(Y-aX)^2] = E[Y^2-2aXY+a^2X^2]=(1+a^2)\mathrm{Var}(X)-2a\mathrm{Cov}(X,Y);\]

\[\frac{\partial}{\partial a}E[(Y-aX)^2] = 2a\mathrm{Var}(X)-2\mathrm{Cov}(X,Y);\]

\[a=\frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)}=\mathrm{Corr}(X,Y).\]

Notice that the answer, the (Pearson) correlation coefficient, is symmetric w.r.t. \(X\) and \(Y\). Thus it will be the same whether we want to predict \(Y\) based on \(X\) or \(X\) based on \(Y\)!

How to make sense of this? It may help to consider a couple of special cases first.

First, suppose that \(X\) and \(Y\) are perfectly correlated and you’re trying to predict \(Y\) based on \(X\). Since \(X\) is such a good predictor, just use its value as it is (\(a=1\)).

Now, suppose that \(X\) and \(Y\) are uncorrelated. Knowing the value of \(X\) doesn’t tell you anything about the value of \(Y\) (as far as linear relationships go). The best predictor you have for \(Y\) is its mean, \(0\).

Finally, suppose that \(X\) and \(Y\) are *somewhat* correlated. The correlation coefficient is the degree to which we should trust the value of \(X\) when predicting \(Y\) versus sticking to \(0\) as a conservative estimate.

This is the key idea—to think about \(a\) in \(y=ax\) not as a degree of proportionality, but as a degree of “trust”.

### How to Build Static Checking Systems Using Orders of Magnitude Less Code

How to Build Static Checking Systems Using Orders of Magnitude Less Code, by Fraser Brown Andres Notzli Dawson Engler:

Modern static bug finding tools are complex. They typically consist of hundreds of thousands of lines of code, and most of them are wedded to one language (or even one compiler). This complexity makes the systems hard to understand, hard to debug, and hard to retarget to new languages, thereby dramatically limiting their scope.

This paper reduces the complexity of the checking system by addressing a fundamental assumption, the assumption that checkers must depend on a full-blown language specification and compiler front end. Instead, our program checkers are based on drastically incomplete language grammars (“micro-grammars”) that describe only portions of a language relevant to a checker. As a result, our implementation is tiny—roughly 2500 lines of code, about two orders of magnitude smaller than a typical system. We hope that this dramatic increase in simplicity will allow developers to use more checkers on more systems in more languages.

We implement our approach in µchex, a language-agnostic framework for writing static bug checkers. We use it to build micro-grammar based checkers for six languages (C, the C preprocessor, C++, Java, JavaScript, and Dart) and find over 700 errors in real-world projects.

Looks like an interesting approach with some compelling results, and will make a good tool for the toolbox. See also the Reddit thread for further discussion.

### Christoph Breitkopf: Weigh your Haskell code

*weigh*library to check memory use much like criterion enables you to check speed. His blog post contains a convincing example how looking at the allocations can point you to major speedups, making me eager to check IntervalMap for such problems.

Implementing the first alloc benchmark was straightforward. It took me only a few minutes to produce some comparisons between Data.Set and IntervalSet:

Case Bytes GCs Check Data.Set fromList 1000 440,496 0 OK IntervalSet fromList 1000 2,643,136 1 OK Data.Set fromAscList 989 126,144 0 OK IntervalSet fromAscList 989 224,688 0 OK Data.Set fromDistinctAscList 989 71,816 0 OK IntervalSet fromDistinctAscList 989 163,440 0 OK Data.Set mapMonotonic id 989 39,560 0 OK IntervalSet mapMonotonic id 989 159,520 0 OK This did not look exactly good. My feeling was that IntervalSet should use more memory, but not that much more. Both use node-oriented binary tree data structures. The nodes in IntervalSet have one more field for the interval search information, but that should be at most a 25% difference, not a factor of 4.

The simplest of these functions happens to be mapMonotonic. It just maps a function over the values, keeping the tree structure unchanged. So, when calling it with the id function, we would expect it to allocate one new node per value in the set, and nothing else. Nodes in Data.Set have 4 fields, and should take 40 bytes on a 64-bit machine. So mapMonotonic id on a size 1000 set should allocate 40,000 bytes, plus maybe some padding and a little general overhead. There are 989 unique values in the test data, so the allocation by Data.Set is perfect: 39,560 = 989×40 bytes. IntervalSet nodes have one more field, giving 48 bytes per node, but it allocates about 160,000 instead of the expected 47,472 bytes.

Let's look at the implementation:

mapMonotonic _ Nil = Nil mapMonotonic f (Node c k _ l r) = mNode c (f k) (mapMonotonic f l) (mapMonotonic f r) The mNode function is just the Node constructor augmented with the computation of the interval search information.

In Haskell, one possibility for allocations is Laziness: a computation is stored on the heap (a "thunk") to be done later. This does happen when mapping values over a tree, but both Set and IntervalSet are defined to be

*spine-strict*, which means that the structure of the tree is always fully evaluated. Technically, it means that the node constructor will evaluate the left and right subtree arguments.

The mNode function turned out to be strict in all its arguments, but the interval annotation uses some functions from the Interval typeclass where the compiler can't infer strictness. Adding strictness via bang-patterns gave a small improvement:

Case Original New IntervalSet fromList 1000 2,643,136 2,302,576 IntervalSet fromAscList 989 224,688 205,608 IntervalSet fromDistinctAscList 989 163,440 143,920 IntervalSet mapMonotonic id 989 159,520 142,320 At this point I probably should have looked at the GHC produced core, to see where exactly the allocation happens. But I still decided to do some more code tuning first because I knew about one major case for inefficiency in the node construction even without allocation being necessarily involved. The node construction needs to determine the interval with the largest upper bound from two intervals, and that operation is not directly present in the Interval typeclass. Just comparing the upper bounds is not enough, since one of the intervals could be open and the other closed, in which case equal upper bounds turn out not to be equal after all. So this operation has to be synthesized from upperBound and rightClosed:

compareUpperBounds a b = case compare (upperBound a) (upperBound b) of LT -> LT GT -> GT EQ -> case (rightClosed a, rightClosed b) of (False, True) -> LT (True, False) -> GT _ -> EQ For most actual instances of Interval, the rightClosed function will return a constant value (e.g., rightClosed = const True), so that this function could be simplified to a single compare. But GHC can do this only if knows and can inline the concrete Interval instance used. This requires liberal use of the INLINABLE pragma and can lead to code and compile-time bloat. Data.Set does this to some extent, but I don't think it is feasible for IntervalSet.

So I bit the bullet and added this function and a default implementation to the Interval typeclass. Results:

Case Original New IntervalSet fromList 1000 2,643,136 576,208 IntervalSet fromAscList 989 224,688 110,760 IntervalSet fromDistinctAscList 989 163,440 47,488 IntervalSet mapMonotonic id 989 159,520 47,472 About perfect! I should add that along with the reduction in allocations comes a speed improvement of about 30 - 50% for all operations that manipulate sets. Same for IntervalMaps. Is it worth adding otherwise unneeded functions to the typeclass for that? I think yes. A new version of IntervalMap incorporating these speedups is already up on Hackage.

### LOPSTR 2016: 2nd Call for Papers

### Functional Jobs: CTO (Haskell/Clojure) at Capital Match (Full-time)

**Overview**

Capital Match (www.capital-match.com) is a leading peer-to-peer lending platform in Singapore. Our in-house platform, mostly developed in Haskell, has in the last year seen more than USD 6 million business loans processed with a strong monthly growth (current rate of USD 1-2 million monthly).

We have just secured another funding round to build a world-class technology as the key business differentiator. The key components include credit risk engine, seamless banking integration and end-to-end product automation from loan origination to debt collection.

**Responsibilities**

We are looking to hire an experienced software engineer (min. 5-8 years coding experience) with a team lead and product management experience to lead the efforts. The current tech team includes a product manager and 3 software engineers. CTO would directly report to CEO and the Board of Directors.

The candidate should have been involved in a development of multiple web-based products from scratch. He should be interested in all aspects of the creation, growth and operations of a secure web-based platform: front-to-back features development, distributed deployment and automation in the cloud, build and test automation etc.

Background in fintech and especially lending space would be a great advantage.

**Requirements**

Our platform is primarily developed in Haskell with an Om/ClojureScript frontend. We are expecting our candidate to have experience working with a functional programming language e.g. Haskell/Scala/OCaml/F#/Clojure/Lisp/Erlang.

Deployment and production is managed with Docker containers using standard cloud infrastructure so familiarity with Linux systems, command-line environment and cloud-based deployment is mandatory. Minimum exposure to and understanding of XP practices (TDD, CI, Emergent Design, Refactoring, Peer review and programming, Continuous improvement) is expected.

We are looking for candidates that are living in or are willing to relocate to Singapore.

**Offer**

We offer a combination of salary and equity depending on experience and skills:

**Salary:** USD 6,000-8,000 / month

**Equity:** ca. 1% (subject to vesting)

European citizens who relocate to Singapore do not have to pay their home country taxes and the local tax rate in Singapore is more or less 5% (effective on the proposed salary range). Visa sponsorship will be provided.

Singapore is a great place to live, a vibrant city rich with diverse cultures, a very strong financial sector and a central location in Southeast Asia.

Get information on how to apply for this position.

### Functional Jobs: Software Engineer (Haskell/Clojure) at Capital Match (Full-time)

**Overview**

Capital Match (www.capital-match.com) is a leading peer-to-peer lending platform in Singapore. Our in-house platform, mostly developed in Haskell, has in the last year seen more than USD 6 million business loans processed with a strong monthly growth (current rate of USD 1-2 million monthly).

We have just secured another funding round to build a world-class technology as the key business differentiator. The key components include credit risk engine, seamless banking integration and end-to-end product automation from loan origination to debt collection.

**Responsibilities**

We are looking to hire a software engineer with a minimum of 2-3 years coding experience. The current tech team includes a product manager and 3 software engineers. We are currently also in the process of hiring CTO.

The candidate should have been involved in a development of multiple web-based products from scratch. He should be interested in all aspects of the creation, growth and operations of a secure web-based platform: front-to-back features development, distributed deployment and automation in the cloud, build and test automation etc.

Background in fintech and especially lending space would be a great advantage.

**Requirements**

Our platform is primarily developed in Haskell with an Om/ClojureScript frontend. We are expecting our candidate to have experience working with a functional programming language e.g. Haskell/Scala/OCaml/F#/Clojure/Lisp/Erlang.

Deployment and production is managed with Docker containers using standard cloud infrastructure so familiarity with Linux systems, command-line environment and cloud-based deployment is mandatory. Minimum exposure to and understanding of XP practices (TDD, CI, Emergent Design, Refactoring, Peer review and programming, Continuous improvement) is expected.

We are looking for candidates that are living in or are willing to relocate to Singapore.

**Offer**

We offer a combination of salary and equity depending on experience and skills:

**Salary:** USD 3,500-5,000 / month

**Equity:** ca. 0.5% (subject to vesting)

European citizens who relocate to Singapore do not have to pay their home country taxes and the local tax rate in Singapore is more or less 5% (effective on the proposed salary range). Visa sponsorship will be provided.

Singapore is a great place to live, a vibrant city rich with diverse cultures, a very strong financial sector and a central location in Southeast Asia.

Get information on how to apply for this position.

### Douglas M. Auclair (geophf): May 2016 1HaskellADay Problems and Solutions

**May 2016**

- May 31st, 2016: Today's #haskell problem gets a subsequence of DNA from ... DNA! http://lpaste.net/1527607897389793280 Today's #haskell solution: DNA subsequence GOT http://lpaste.net/5828328688629841920
- May 30th, 2016: We find the number of RNA strands that can generate a protein ... modulo reasonableness for today's #haskell problem http://lpaste.net/7114368903530151936 Today's #haskell solution shows us there are 'more than a few' RNA strands to compose some proteins http://lpaste.net/3827843057999413248
- May 27th, 2016: For today's #haskell problem we sing: No woman, Nah Trie! http://lpaste.net/8137189406290214912 Or: Breath-first Trie Traversal via the branches. We take breathing-lessons for today's #haskell solution http://lpaste.net/2088953736360624128
- May 26th, 2016: For today's #haskell problem we label the nodes of a Trie ... http://lpaste.net/658981668358455296 or at least we ... TRIE to! *groan Today's #haskell solution is a depth-first traversal of a Triehttp://lpaste.net/7964088321452277760
- May 25th, 2016: Today's #haskell problem ask you to exercise your probability muscles and look at some allele-pairingshttp://lpaste.net/7098281613896712192 That's all.
- May 24th, 2016: Reverse Palindromes in DNA strands?Yes. http://lpaste.net/4547532761941934080 Today's #haskell problem. Turns out there are lots of palindromes in DNA strands as today's #haskell solution shows. http://lpaste.net/8959978868165312512 Go figure.
- May 23rd, 2016: Splice, splice, baby! We get all Vanilla Ice with RNA with today's #haskell problem http://lpaste.net/2811938573572374528 We get our protein when with today's hand-rolled List-y-(\\) #haskell solution http://lpaste.net/5823105188059152384
- May 20th, 2016: For today's #haskell problem we consider the weighty matter of ... weighing matter http://lpaste.net/7500383126527410176 The weight's the thing Wherein I'll catch the conscience of the King ... no ... weight ...Today's #haskell solution http://lpaste.net/7743348996965400576
- May 19th, 2016: Today's #haskell problem looks at the Carmina Buranahttp://lpaste.net/684224423812661248 ... uh, no, it doesn't: Open Reading Frames. Today's #haskell solution sequences DNA strands to proteins http://lpaste.net/792645775074000896
- May 18th, 2016: If at first you do not succeed, trie and trie again today's #haskell problem http://lpaste.net/8558987235213443072 #trie #datatype
- May 17th, 2016: "Houston. We have a problem." #bigdata factors into today's #haskell problem of longest monotonic sequenceshttp://lpaste.net/6778179741435297792
- May 16th, 2016: Today's #Haskell problem asks for longest monotonic increasing/decreasing sequences ... in the small ... http://lpaste.net/7252474989977272320 So, the #haskell solution works against small data set, but since it's bifurcating, against large data sets? ... http://lpaste.net/7003206401061289984
- May 13th, 2016: #Rosalind is impatient. She doesn't have all day! Let's find a real-world solution to today's #haskell problem http://lpaste.net/8254705170412208128 Finding the longest common gene subsequence of 100 strands each 1000 nucleotides long in #haskell http://lpaste.net/2468395370904813568 Can you find faster?
- May 12th, 2016: We propose a little problem from Set Theory for today's #haskell problem http://lpaste.net/4050317579338645504 And we have today's #haskell in-a-perfect-world solution of the set-intersection of all subsequenceshttp://lpaste.net/1682855623517011968
- May 11th, 2016: Today's #haskell problem is all about the inods ... not 'inodes,' but 'internal nodes of an unrooted binary tree.' http://lpaste.net/7455193258755358720 "inods?" you say? "Of an unrooted binary tree?" you say? That would be pred . pred for today's #haskell solution http://lpaste.net/4766815087493120000
- May 10th, 2016: For today's #haskell problem we take it to the next level, see? and get variable-length strings as enumerated tokens http://lpaste.net/3180555571975684096 We pulled the powerSet Set.fromList Set.toList switcheroo for today's #haskell solution. http://lpaste.net/7501776843414962176
- May 9th, 2016: For today's #haskell problem we sing our A, B, Cs ... #Rosalind-style! http://lpaste.net/1445499963915108352 Today's #Haskell solution enumerates lexed tokens http://lpaste.net/503582828101894144
- May 6th, 2016: For today's #haskell problem we ask #Rosalind and she says 'Permutations.' So that's what we do! http://lpaste.net/6558920694606856192 This is the dawning of the age of Aquarius! "Hair" a movie about perms for today's #haskell solution http://lpaste.net/934285743432400896 @NickiElson3D
- May 5th, 2016: Today's #haskell problem: p-distance Matriceshttp://lpaste.net/818771459840147456 Believe you me: you will get a LOT of continued employment with these Today's #haskell solution is filed under #jobsecurity http://lpaste.net/5076187573303377920 #distanceMatrix
- May 4th, 2016: Today's #haskell problem we look at completing a graph into the Tree of Life http://lpaste.net/1315899789614776320 #Rosalind Today's #haskell solution is graph-y, or also take the approach of @bazzargh http://lpaste.net/1052721051462533120
- May 3rd, 2016: Today's #haskell problem looks at overlapping graphs, specifically for DNA strands from rosalind.info http://lpaste.net/8920617631791185920 Did you know a list is a graph? Today's #haskell solution was an elaboration of `isSuffixOf` http://lpaste.net/5785600391169179648 #Rosalind
- May 2nd, 2016: For today's #haskell problem we look at transforming our data into d3: "Data-driven document"-JSONhttp://lpaste.net/224067301371019264 For the #haskell solution we get a bird's-eye view of the Top5s stocks http://lpaste.net/7060337987313205248