News aggregator

Roman Cheplyaka: How to prepare a good pull request

Planet Haskell - Sun, 09/18/2016 - 2:00pm
  1. A pull request should have a specific goal and have a descriptive title. Do not put multiple unrelated changes in a single pull request.

  2. Do not include any changes that are irrelevant to the goal of the pull request.

    This includes refactoring or reformatting unrelated code and changing or adding auxiliary files (.gitignore, .travis.yml etc.) in a way that is not related to your main changes.

  3. Make logical, not historical commits.

    Before you submit your work for review, you should rebase your branch (git rebase -i) and regroup your changes into logical commits.

    Logical commits achieve different parts of the pull request goal. Each commit should have a descriptive commit message. Logical commits within a single pull request rarely overlap in the lines of code they touch.

    If you want to amend your pull request, rewrite the branch and force-push it instead of adding new (historical) commits or creating a new pull request.

  4. Make clean commits. Run git diff or git show on your commits. It will show you issues like trailing whitespace or missing newlines at the end of the file.

.gitignore

My .gitignore policy is that the project-specific .gitignore file should only contain patterns specific for this project. For instance, if a test suite generates files *.out, this pattern belongs to the project’s .gitignore.

If a pattern is standard across a wide range of projects (e.g. *.o, or .stack-work for Haskell projects), then it belongs to the user-specific ~/.gitignore.

stack.yaml

(This section is specific to Haskell.)

My policy is to track stack.yaml inside the repo for applications, but not for libraries.

The rationale is that for an application, stack.yaml provides a useful bit of metainformation: which snapshot the app is guaranteed to build with. Additionally, non-programmers (or non-Haskell programmers) may want to install the application, and the presence of stack.yaml makes it easy for them.

These benefits do not apply to libraries. And the cost of including .stack.yaml is:

  • The snapshot version gets out of date quickly, so you need to update this file regularly.
  • This file is often changed temporarily (e.g. to test a specific version of a dependency), and if it is tracked, you need to pay attention not to commit those changes by accident.
Categories: Offsite Blogs

Tom Schrijvers: Doctoral or Post-Doctoral Position in Programming Languages Theory & Implementation

Planet Haskell - Fri, 09/16/2016 - 7:08pm
I am looking for a new member to join my research team in either a doctoral or post-doctoral position.

You can find more details here.
Categories: Offsite Blogs

wren gayle romano: Visiting Nara over the next week

Planet Haskell - Thu, 09/15/2016 - 10:39pm

I announced this on twitter a while back, but tomorrow I'm flying out to Nara Japan. I'll be out there all week for ICFP and all that jazz. It's been about a decade since last time I was in the Kansai region, and I can't wait. As I've done in the past, if you want to meet up for lunch or dinner, just comment below (or shoot me a tweet, email, etc).



comments
Categories: Offsite Blogs

Douglas M. Auclair (geophf): August 2016 1HaskellADay 1Liners

Planet Haskell - Thu, 09/15/2016 - 8:29am
  • August 20th, 2016: maybeify :: (a, Maybe b) -> Maybe (a, b)
    Define maybeify. Snaps for elegance.
    • Hardy Jones @st58 sequence
    • Bruno @Brun0Cad mapM id
    • Thomas D @tthomasdd {-# LANGUAGE TupleSections #-}
      mabeify (x,mY) = maybe Nothing (return . (x,)) mY
    • Андреев Кирилл @nonaem00 import "category-extras" Control.Functor.Strong
      maybeify = uncurry strength
    • bazzargh @bazzargh I can't beat 'sequence', but: uncurry (fmap.(,))
    • Nick @crazy_fizruk distribute (from Data.Distributive)
Categories: Offsite Blogs

Brent Yorgey: Meeting people at ICFP in Nara

Planet Haskell - Thu, 09/15/2016 - 7:53am

In less than 24 hours I’m getting on a plane to Japan (well, technically, Dallas, but I’ll get to Japan eventually). As I did last year, I’m making an open offer here: leave a comment on this post, and I will make a point of finding and meeting you sometime during the week! One person took me up on the offer last year and we had a nice chat over dinner.


Categories: Offsite Blogs

Manuel M T Chakravarty: This is the video of my Compose :: Melbourne keynote. I am...

Planet Haskell - Wed, 09/14/2016 - 9:00pm
<iframe allowfullscreen="allowfullscreen" frameborder="0" height="225" id="youtube_iframe" src="https://www.youtube.com/embed/9dk7_GDNocQ?feature=oembed&amp;enablejsapi=1&amp;origin=http://safe.txmblr.com&amp;wmode=opaque" width="400"></iframe>

This is the video of my Compose :: Melbourne keynote. I am making the case for purely functional graphics programming with Haskell playgrounds, including live programming a little game (for the second half of the talk).

Some of the code is hard to read in the video; you may like to refer to the slides.

Categories: Offsite Blogs

Jens Petersen: Stackage LTS 7 is released

Planet Haskell - Wed, 09/14/2016 - 12:01pm
The Stackage curator team is pleased to announce the initial release of Stackage LTS 7 for ghc-8.0.1. Released over 3.5 months after LTS 6.0, the biggest change introduced with LTS 7.0 is the move from ghc-7.10.3 to ghc-8.0.1.  There is also the usual large number of (major) version number bumps: in order to achieve this we used a new one-time pruning step in Stackage Nightly development dropping all packages with constraining Stackage upper-bounds.  Nevertheless LTS 7.0 has 1986 packages which is almost as many as LTS 6.0 with 1994 packages (latest 6.17 at the time of writing has 2002 packages) thanks to all the work of the community of Haskell maintainers. We are going to do the major pruning step earlier for current Stackage Nightly now that LTS 7 is branched from Nightly, which will give package maintainers even more time to get ready for LTS 8.
Categories: Offsite Blogs

The haskell-lang.org team: Updates for September 14, 2016

Planet Haskell - Wed, 09/14/2016 - 10:00am

The biggest update to the site is the addition of three new targeted "next steps" tutorials on the get started page for using Stack, aimed at Play (using the REPL), Script (single-file programs), and Build (full projects). Hopefully this will help people with different goals all get started with Haskell quickly.

In addition, we have included a few new tutorials:

Plus a few other minor edits throughout the site.

The complete diff can be found here.

Categories: Offsite Blogs

Jan Stolarek: Moving to University of Edinburgh

Planet Haskell - Wed, 09/14/2016 - 6:45am

I wanted to let you all know that after working for 8 years as a Lecturer at the Institute of Information Technology (Lodz University of Technology, Poland), I have received a sabbatical leave to focus solely on research. Yesterday I began my work as a Research Associate at the Laboratory for Foundations of Computer Science, University of Edinburgh. This is a two-year post-doc position. I will be part of the team working on the Skye project under supervision of James Cheney. This means that from now on I will mostly focus on developing the Links programming language.

Categories: Offsite Blogs

FP Complete: Working with data in Haskell

Planet Haskell - Wed, 09/14/2016 - 1:00am

In data mining or general exploration, it's common to need to easily access data efficiently and without ceremony. Typically, a programming language will be designed for this case specifically, like R, or a library will be written for it, like Python with the pandas library.

Implementing this in Haskell, we improve upon this area with all the benefits that come with using Haskell over Python or R, such as:

  • Safety - garbage collected memory, no need for pointers.
  • Performance - it's fast: If you write new algorithms in Haskell, they don't to be added to the language (like R) or written in C (like Python).
  • Concurrency - make use of concurrent algorithms trivially, and take advantage of your other CPU cores.
  • Maintainability - static types ensure safety at the time you write the program, and when you come back later to change them, it's harder to break them.

Let's look at an example of doing this in Haskell, and compare with how this is done in Python's pandas. The steps are:

  1. Download a zip file containing a CSV file.
  2. Unzip the file.
  3. Read through the CSV file.
  4. Do some manipulation of the data from the file.

In Haskell we have all the libraries needed (streaming HTTP, CSV parsing, etc.) to achieve this goal, so specifically for this post I've made a wrapper package that brings them together like pandas does. We have some goals:

  • Convenience We don't want to have to write more code than necessary while exploring data.
  • Constant memory Be able to process the file in constant memory space. If I have a 1GB file I don't want to have to load all 1GB into memory in one go.
  • Type-safe I would like that once parsed from CSV, I have a statically-typed data structure with proper types (integers, dates, text, etc.).
Python example

This example code was taken from Modern Pandas. In Python we request the web URL in chunks, which we then write to a file. Next, we unzip the file, and then the data is available as df, with column names downcased.

import zipfile import requests import numpy as np import pandas as pd import seaborn as sns import matplotlib.pyplot as plt r = requests.get('http://chrisdone.com/ontime.csv.zip', stream=True) with open("flights.csv", 'wb') as f: for chunk in r.iter_content(chunk_size=1024): if chunk: f.write(chunk) zf = zipfile.ZipFile("flights.csv.zip") filename = zf.filelist[0].filename fp = zf.extract(filename) df = pd.read_csv(fp, parse_dates="FL_DATE").rename(columns=str.lower)

Finally, we can look at the 5 rows starting at row 10, for the columns fl_date and tail_num, like this:

df.ix[10:14, ['fl_date', 'tail_num']]

=>

fl_date tail_num 10 2014-01-01 N002AA 11 2014-01-01 N3FXAA 12 2014-01-01 N906EV 13 2014-01-01 N903EV 14 2014-01-01 N903EVPython: good and bad

Good parts of the Python code:

  • Extracting from the Zip file was fairly easy.
  • We were able to specify for some fields to parse them as a different data type (parse_dates).
  • Referencing a range of the data was easy.

Bad parts of the Python code:

  • Reading the HTTP request was very verbose. We manually streamed the chunks to disk, which seems pointless.
  • It's not statically typed. Even though we parsed fl_date and tail_num, we can't be certain down the line if they still exist, or are of the right type.
  • We loaded the whole 100MB CSV into memory.

Let's compare with the solution I prepared in Haskell. While reading, you can also clone the repository that I put together:

$ git clone git@github.com:chrisdone/labels.git --recursive

The wrapper library created for this post is under labels-explore, and all the code samples are under labels-explore/app/Main.hs.

Haskell example

I prepared the module Labels.RunResourceT which provides us with some data manipulation functionality: web requests, unzipping, CSV parsing, etc.

{-# LANGUAGE TypeApplications, OverloadedStrings, OverloadedLabels, TypeOperators, DataKinds, FlexibleContexts #-} import Labels.RunResourceT main = runResourceT $ httpSource "http://chrisdone.com/ontime.csv.zip" responseBody .| zipEntryConduit "ontime.csv" .| fromCsvConduit @("fl_date" := Day, "tail_num" := String) (set #downcase True csv) .| dropConduit 10 .| takeConduit 5 .> tableSink

Output:

fl_date tail_num 2014-01-01 N002AA 2014-01-01 N3FXAA 2014-01-01 N906EV 2014-01-01 N903EV 2014-01-01 N903EV

Breaking this down, the src .| c .| c .> sink can be read like a UNIX pipe src | c | c > sink.

The steps are:

  • Make a web request for the zip file and yield a stream of bytes.
  • Unzip that stream of bytes and yield a stream of bytes of the CSV.
  • Parse from CSV into records of type ("fl_date" := Day, "tail_num" := String).
  • Specify the downcase option so we can deal with lower-case names.
  • Drop the first 10 results.
  • Take 5 of the remaining results.
  • Print the table out.

In this library the naming convention for parts of the pipline is:

  • fooSource -- something which is at the beginning of the pipeline, a source of streaming input.
  • fooConduit -- something which connects two parts of the pipeline together and perhaps does some transformations (such as parsing the Zip, CSV or other things).
  • fooSink -- something into which all streaming input is poured, and produces a final result.
Haskell: good parts

What's good about the Haskell version:

  • Reading the HTTP request was trivial.
  • It's statically typed. If I try to multiply the fl_date as a number, for example, or mistakenly write fl_daet, I'll get a compile error before ever running the program.
  • It achieves it all in a streaming fashion, with constant memory usage.

How is it statically typed? Here:

fromCsvConduit @("fl_date" := Day, "tail_num" := String) csv

We've statically told fromCsvConduit the exact type of record to construct: a record of two fields fl_date and tail_num with types Day and String. Below, we'll look at accessing those fields in an algorithm and demonstrate the safety aspect of this.

Swapping out pipeline parts

We can also easily switch to reading from file. Let's write that URL to disk, uncompressed:

main = runResourceT (httpSource "http://chrisdone.com/ontime.csv.zip" responseBody .| zipEntryConduit "ontime.csv" .> fileSink "ontime.csv")

Now our reading becomes:

main = runResourceT $ fileSource "ontime.csv" .| fromCsvConduit @("fl_date" := Day, "tail_num" := String) (set #downcase True csv) .| dropConduit 10 .| takeConduit 5 .> tableSinkData crunching

It's easy to perform more detailed calculations. For example, to display the number of total flights, and the total distance that would be travelled, we can write:

main = runResourceT $ fileSource "ontime.csv" .| fromCsvConduit @("distance" := Double) (set #downcase True csv) .| sinkConduit (foldSink (\table row -> modify #flights (+ 1) (modify #distance (+ get #distance row) table)) (#flights := (0 :: Int), #distance := 0)) .> tableSink

The output is:

flights distance 471949 372072490.0

Above we made our own sink which consumes all the rows, and then yielded the result of that downstream to the table sink, so that we get the nice table display at the end.

Type correctness

Returning to our safety point, imagine above we made some mistakes.

First mistake, I wrote modify #flights twice by accident:

- modify #flights (+ 1) (modify #distance (+ get #distance row) table)) + modify #flights (+ 1) (modify #flights (+ get #distance row) table))

Before running the program, the following message would be raised by the Haskell type checker:

• Couldn't match type ‘Int’ with ‘Double’ arising from a functional dependency between: constraint ‘Has "flights" Double ("flights" := Int, "distance" := value0)’ arising from a use of ‘modify’

See below for where this information comes from in the code:

main = runResourceT $ fileSource "ontime.csv" .| -- -- The distance field is actually a double -- ↓ -- fromCsvConduit @("distance" := Double) (set #downcase True csv) .| sinkConduit (foldSink (\table row -> modify #flights (+ 1) (modify #flights (+ get #distance row) table)) -- -- But we're trying to modify `#flights`, which is an `Int`. -- ↓ -- (#flights := (0 :: Int), #distance := 0)) .> tableSink

Likewise, if we misspelled #distance as #distant, in our algorithm:

- modify #flights (+ 1) (modify #distance (+ get #distance row) table)) + modify #flights (+ 1) (modify #distance (+ get #distant row) table))

We would get this error message:

No instance for (Has "distant" value0 ("distance" := Double)) arising from a use of ‘get’

Summarizing:

  • The correct values being parsed from the CSV.
  • Fields must exist if we're accessing them.
  • We can't mismatch types.

All this adds up to more maintainable software, and yet we didn't have to state any more than necessary!

Grouping

If instead we'd like to group by a field, in pandas it's like this:

first = df.groupby('airline_id')[['fl_date', 'unique_carrier']].first() first.head()

We simply update the code with the type, putting the additional fields we want to parse:

csv :: Csv ("fl_date" := Day, "tail_num" := String ,"airline_id" := Int, "unique_carrier" := String)

And then our pipeline instead becomes:

fromCsvConduit @("fl_date" := Day, "tail_num" := String, "airline_id" := Int, "unique_carrier" := String) (set #downcase True csv) .| groupConduit #airline_id .| explodeConduit .| projectConduit @("fl_date" := _, "unique_carrier" := _) .| takeConduit 5 .> tableSink
  • We added the two new fields to be parsed.
  • We grouped by the #airline_id field into a stream of lists of rows. That groups the stream [x,y,z,a,b,c] into e.g. [[x,y],[z,a],[b,c]].
  • We explode those groups [[x,y],[z,a],[b,c],...] into a stream of each group's parts: [x,y,z,a,b,c,...].
  • We projected a new record type just for the table display to include fl_date and unique_carrier. The types are to be left as-is, so we use _ to mean "you know what I mean". This is like SELECT fl_date, unique_carrier in SQL.

Output:

unique_carrier fl_date AA 2014-01-01 AA 2014-01-01 EV 2014-01-01 EV 2014-01-01 EV 2014-01-01

The Python blog post states that a further query upon that result,

first.ix[10:15, ['fl_date', 'tail_num']]

yields an unexpected empty data frame, due to strange indexing behaviour of pandas. But ours works out fine, we just drop 10 elements from the input stream and project tail_num instead:

dropConduit 10 .| projectConduit @("fl_date" := _, "tail_num" := _) .| takeConduit 5 .> tableSink

And we get

fl_date tail_num 2014-01-01 N002AA 2014-01-01 N3FXAA 2014-01-01 N906EV 2014-01-01 N903EV 2014-01-01 N903EVConclusion

In this post we've demonstrated:

  1. Concisely handling a chain of problems smoothly like a bash script.
  2. Done all the above in constant memory usage.
  3. Done so with a type-safe parser, specifying our types statically, but without having to declare or name any record type ahead of time.

This has been a demonstration, and not a finished product. Haskell needs work in this area, and the examples in this post are not performant (but could be), but such work would be very fruitful.

Are the advantages of using Haskell something you're interested in? If so, contact us at FP Complete.

Categories: Offsite Blogs

FP Complete: Practical Haskell: Simple File Mirror (Part 1)

Planet Haskell - Wed, 09/14/2016 - 1:00am

The other day I threw together a quick program to solve an annoyance some people on our team were expressing. We sometimes do our development on remote machines, but would still like to use local file editors. There are plenty of solutions for this (SSHFS, inotify+rsync), but none of us ever found a prebaked solution that was low-latency and robust enough to use regularly.

The program I put together is a simple client/server combo, where the client watches for local file changes and sends the updates to the server, which writes them to disk. This approach reduces latency significantly by keeping an open TCP connection, and can be tunneled over SSH for security. It's simple, and the code is short and readable.

This program turned out to be a good demonstration of a few different common problem domains when writing practical, real world Haskell code:

  • Communication protocols
  • Streaming of data
  • Network communications
  • File monitoring (eh, probably not that common, but cool)
  • Command line option parsing

This blog post series will build up from basics to being able to implement a program like the simple file mirror. This first post of three planned posts will cover communication protocols and streaming of data. You can also read:

  • Part 2, on network communication and concurrency

Prereqs

This series will contain a number of self-contained code samples, presented as runnable scripts. In order to run these scripts, copy them into a file ending with .hs (e.g., practical.hs), and then run it with stack practical.hs). You will need to install the Stack build tool. Stack will take care of downloading and installing any compiler and libraries necessary. Because of that: your first time running a script will take a bit of time while downloading and installing.

If you'd like to test that you're ready to proceed, try running this program:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc main :: IO () main = putStrLn "Hello world!"

Also, this series of posts will use the classy-prelude-conduit library, which provides a lot of built-in functionality to make it easier to follow. If you'd like to kick off a build of that library now so it's available when you want to use it, run:

$ stack --resolver nightly-2016-09-10 --install-ghc build classy-prelude-conduitTextual versus binary data

We humans have an interesting tendancy to want to communicate - with each other and our computers - in human languages. Computers don't care: they just see binary data. Often times, this overlaps just fine, specifically when dealing with the ASCII subset of the character set. But generally speaking, we need some way to distinguish between textual and binary data.

In Haskell, we do this by using two different data types: Text and ByteString. In order to convert between these two, we need to choose a character encoding, and most often we'll choose UTF-8. We can perform this conversion with the encodeUtf8 and decodeUtf8 functions:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit main :: IO () main = do let someText :: Text someText = unlines [ "Hello, this is English." , "Hola, este es el español." , "שלום, זה עברית." ] binary :: ByteString binary = encodeUtf8 someText filePath :: FilePath filePath = "say-hello.txt" writeFile filePath binary binary2 <- readFile filePath :: IO ByteString let text2 = decodeUtf8 binary2 putStrLn text2

If you're coming from a language that doesn't have this strong separation, it may feel clunky at first. But after years of experience with having a type-level distinction between textual and binary data, I can attest to the fact that the compiler has many times prevented me from making some mistakes that would have been terrible in practice. One example popped up when working on the simple-file-mirror tool itself: I tried to send the number of characters in a file path over the wire, instead of sending the number of bytes after encoding it.

The OverloadedStrings language extension lets us use string literals for all of these string-like things, including (as you saw above) file paths.

NOTE: For historical reasons which are being worked on, there is additionally a String type, which is redundant with Text but far less efficient. I mention it because you may see references to it when working with some Haskell documentation. Whenever possible, stick to Text or ByteString, depending on your actual data content.

Some simple data streaming

OK, that was a bit abstract. Let's do something more concrete. We're going to use the conduit library to represent streaming data. To start off simple, let's stream the numbers 1 to 10 and print each of them:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit main :: IO () main = yieldMany [1..10] $$ mapM_C print

Our yieldMany function will take a sequence of values - in this case a list from 1 to 10 - and yield them downstream. For those familiar, this is similar to yielding in Python with generators. The idea is that we will build a pipeline of multiple components, each awaiting for values from upstream and yielding values downstream.

Our mapM_C print component will apply the function print to every value it receives from upstream. The C suffix is used for disambiguating conduit functions from non-conduit functions. Finally, the $$ operator in the middle is the "connect" function, which connects a source of data to a sink of data and runs it. As you might guess, the above prints the numbers 1 to 10.

We can also put other components into our pipeline, including functions that both await values from upstream and yield them downstream. For example:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit main :: IO () main = yieldMany [1..10] $$ mapC (* 10) =$ mapM_C print

The mapC function applies a function to each value in the stream and yields it downstream, while the =$ operator connects two components together. Take a guess at what the output from this will be, and then give it a try.

NOTE There's a subtle but important different between $$ and =$. $$ will connect two components and then run the result to completion to get a result. =$ will connect two components without running them so that it can be further connected to other components. In a single pipeline, you'll end up with one $$ usage.

We can also do more interesting consumption of a stream, like summing:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit main :: IO () main = do total <- yieldMany [1..10] $$ mapC (* 10) =$ sumC print total -- or more explicitly... total <- yieldMany [1..10] $$ mapC (* 10) =$ foldlC (+) 0 print total

Or limit how much of the stream we want to consume:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit main :: IO () main = do total <- yieldMany [1..10] $$ mapC (* 10) =$ takeC 5 =$ sumC print total

This only scratches the surface of what we can do with conduit, but hopefully it gives enough of a basic intuition for the library to get started. If you're interested in diving in deep on the conduit library, check out the previously linked tutorial.

Streaming chunked data

Having a stream of individual bytes turns out to be inefficient in practice. It's much better to chunk a series of bytes into an efficient data structure like a ByteString. Let's see what it looks like to stream data from a file to standard output:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit import qualified System.IO as IO main :: IO () main = do writeFile "some-file.txt" ("This is just some text." :: ByteString) -- Yes, this is clunky. We'll do something much slicker in a bit. IO.withBinaryFile "some-file.txt" IO.ReadMode $ \fileHandle -> sourceHandle fileHandle $$ decodeUtf8C =$ stdoutC

This is good, but what if we want to deal with the individual bytes in the stream instead of the chunks. For example, let's say we want to get just the first 10 bytes of our file. The takeC function we used above would take the first five chunks of data. We instead need a function which will work on the elements of the bytestrings (the individual bytes). Fortunately, conduit provides for this with the E-suffixed element-specific functions:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit import qualified System.IO as IO main :: IO () main = do writeFile "some-file.txt" ("This is just some text." :: ByteString) -- Yes, this is clunky. We'll do something much slicker in a bit. IO.withBinaryFile "some-file.txt" IO.ReadMode $ \fileHandle -> sourceHandle fileHandle $$ takeCE 10 =$ decodeUtf8C =$ stdoutC

In the simple-file-mirror program, we will be sending files over the network, and will need to limit some operations to the actual file sizes in question. Functions like takeCE will be vital for doing this.

Managing resources

While the withBinaryFile approach above worked, it felt kind of clunky. And for more complicated control flows, opening up the file in advance won't be an option (like when we'll only know which file to open after the network connection tells us the file path). To allow for these cases, we're going to introduce the runResourceT, which allows us to acquire resources in an exception-safe manner. Let's rewrite the above example with runResourceT:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit main :: IO () main = do writeFile "some-file.txt" ("This is just some text." :: ByteString) runResourceT $ sourceFile "some-file.txt" $$ takeCE 10 =$ decodeUtf8C =$ stdoutC

Internally, sourceFile uses the bracketP function, which runs some initialization function (in our case, opening a file handle), registers some cleanup function (in our case, closing that file handle), and then performs an action with the resource. To demonstrate what that looks like more explicitly, let's write a modified sourceFile function which will return some default file contents if the requested file can't be read from.

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} import ClassyPrelude.Conduit import qualified System.IO as IO sourceFileDef :: MonadResource m => FilePath -> Source m ByteString sourceFileDef fp = do let -- tryIO catches all IO exceptions and puts them in a Left -- value. open = tryIO $ IO.openBinaryFile fp IO.ReadMode -- Exception occurred, so nothing to close. close (Left e) = putStrLn $ "No file to close, got an exception: " ++ tshow e -- No exception, so close the file handle. close (Right h) = hClose h bracketP open close $ \eitherHandle -> case eitherHandle of Left ex -> do yield "I was unable to open the file in question:\n" yield $ encodeUtf8 $ tshow ex ++ "\n" Right fileHandle -> sourceHandle fileHandle main :: IO () main = runResourceT $ sourceFileDef "some-file.txt" $$ decodeUtf8C =$ stdoutCImplementing our protocol

Let's at least get started on our actual simple-file-mirror code. The wire protocol we're going to use is defined in the README, but we can describe it briefly as:

  • Client sends data to server, server never sends to client
  • A line like 9:hello.txt11:Hello World means "write the file hello.txt with the content Hello World"
  • We provide lengths of both the file paths and the file contents with a decimal-encoded integer followed by a colon (similar to netstrings, but without the trailing comma)
  • If a file has been deleted, we use a special length of -1, e.g. 9:hello.txt-1: means "hello.txt was deleted"

So let's get started with implementing the integer send/receive logic in conduit. There are many ways of doing this, some more efficient than others. For this program, I elected to go with the simplest approach possible (though you can see some historical more complicated/efficient versions). Let's start with sending, which is pretty trivial:

#!/usr/bin/env stack -- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RankNTypes #-} import ClassyPrelude.Conduit sendInt :: Monad m => Int -> Producer m ByteString sendInt i = yield $ encodeUtf8 $ tshow i ++ ":" main :: IO () main = yieldMany sampleInts $$ awaitForever sendInt =$ stdoutC where sampleInts = [ 1 , 10 , -5 , 0 , 60 ]

Here we've introduced a new function, awaitForever, which repeatedly applies a function as long as data exists on the stream. Take a guess at what the output of this program will be, and then try it out.

Now let's try out the receiving side of this, which is slightly more complicated, but not too bad:

#!/usr/bin/env stack {- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit --package word8 -} {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RankNTypes #-} import ClassyPrelude.Conduit import Data.Word8 (_colon) sendInt :: Monad m => Int -> Producer m ByteString sendInt i = yield $ encodeUtf8 $ tshow i ++ ":" recvInt :: MonadThrow m => Consumer ByteString m Int recvInt = do intAsText <- takeWhileCE (/= _colon) =$ decodeUtf8C =$ foldC dropCE 1 -- drop the colon from the stream case readMay $ unpack intAsText of Nothing -> error $ "Invalid int: " ++ show intAsText Just i -> return i main :: IO () main = yieldMany sampleInts $$ awaitForever sendInt =$ peekForeverE (do i <- recvInt print i) where sampleInts = [ 1 , 10 , -5 , 0 , 60 ]

peekForeverE is similar to awaitForever, in that it repeatedly performs an action as long as there is data on the stream. However, it's different in that it doesn't grab the data off of the stream itself, leaving it to the action provided to do that, and it deals correctly with chunked data by ignoring empty chunks.

We've also introduced takeWhileCE, which is like takeCE, but instead of giving it a fixed size of the stream to consume, it continues consuming until it finds the given byte. In our case: we consume until we get to a colon. Then we decode into UTF-8 data, and use foldC to concatenate multiple chunks of Text into a single Text value. Then we use readMay to parse the textual value into an Int. (And yes, we could do a much better job at error handling, but using error is the simplest approach.)

Building on top of these two functions, it becomes a lot easier to send and receive complete file paths. Let's put that code in here, together with a test suite to prove it's all working as expected:

#!/usr/bin/env stack {- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit --package word8 --package hspec -} {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RankNTypes #-} import ClassyPrelude.Conduit import Data.Word8 (_colon) import Test.Hspec import Test.Hspec.QuickCheck (prop) sendInt :: Monad m => Int -> Producer m ByteString sendInt i = yield $ encodeUtf8 $ tshow i ++ ":" sendFilePath :: Monad m => FilePath -> Producer m ByteString sendFilePath fp = do -- UTF-8 encode the filepath let bs = encodeUtf8 $ pack fp :: ByteString -- Send the number of bytes sendInt $ length bs -- Send the actual path yield bs recvInt :: MonadThrow m => Consumer ByteString m Int recvInt = do intAsText <- takeWhileCE (/= _colon) =$ decodeUtf8C =$ foldC dropCE 1 -- drop the colon from the stream case readMay $ unpack intAsText of Nothing -> error $ "Invalid int: " ++ show intAsText Just i -> return i recvFilePath :: MonadThrow m => Consumer ByteString m FilePath recvFilePath = do -- Get the byte count fpLen <- recvInt -- Read in the given number of bytes, decode as UTF-8 text, and -- then fold all of the chunks into a single Text value fpText <- takeCE fpLen =$= decodeUtf8C =$= foldC -- Unpack the text value into a FilePath return $ unpack fpText main :: IO () main = hspec $ do prop "sendInt/recvInt are inverses" $ \i -> do res <- sendInt i $$ recvInt res `shouldBe` i prop "sendFilePath/recvFilePath are inverses" $ \fp -> do res <- sendFilePath fp $$ recvFilePath res `shouldBe` fp

We've used the hspec test framework library and its QuickCheck support to create a test suite which automatically generates test cases based on the types in our program. In this case, it will generate 100 random Ints and 100 random FilePaths each time it's run, and ensure that our properties hold. This is a great way to quickly get significant test coverage for a program.

Sending the files themselves

OK, finally time to put all of this together. We're going to add in some new functions for sending and receiving files themselves. This is a fairly simple composition of all of the work we've done until now. And this is the nice thing about Haskell in general, as well as the conduit library: purity and strong typing often make it possible to trivially combine smaller functions into more complicated beasts. Let's see this all in action:

#!/usr/bin/env stack {- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit --package word8 --package hspec --package temporary -} {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RankNTypes #-} import ClassyPrelude.Conduit import Data.Word8 (_colon) import System.Directory (createDirectoryIfMissing, doesFileExist, removeFile) import System.FilePath (takeDirectory) import System.IO (IOMode (ReadMode), hFileSize, openBinaryFile) import System.IO.Temp (withSystemTempDirectory) import Test.Hspec import Test.Hspec.QuickCheck (prop) sendInt :: Monad m => Int -> Producer m ByteString sendInt i = yield $ encodeUtf8 $ tshow i ++ ":" sendFilePath :: Monad m => FilePath -> Producer m ByteString sendFilePath fp = do -- UTF-8 encode the filepath let bs = encodeUtf8 $ pack fp :: ByteString -- Send the number of bytes sendInt $ length bs -- Send the actual path yield bs sendFile :: MonadResource m => FilePath -- ^ root directory -> FilePath -- ^ path relative to root -> Producer m ByteString sendFile root fp = do -- Send the relative file path, so the other side knows where to -- put the contents sendFilePath fp let open = tryIO $ openBinaryFile fpFull ReadMode -- If the opening failed, we'll have an error message. So -- there's nothing to close, just do nothing! close (Left _err) = return () -- Opening succeeded, so close the file handle. close (Right h) = hClose h -- Grab the file handle... bracketP open close $ \eh -> case eh of -- No file, send a -1 length to indicate file does not -- exist Left _ex -> sendInt (-1) -- File exists Right h -> do -- Send the size of the file size <- liftIO $ hFileSize h sendInt $ fromInteger size -- And stream the contents sourceHandle h where fpFull = root </> fp recvInt :: MonadThrow m => Consumer ByteString m Int recvInt = do intAsText <- takeWhileCE (/= _colon) =$ decodeUtf8C =$ foldC dropCE 1 -- drop the colon from the stream case readMay $ unpack intAsText of Nothing -> error $ "Invalid int: " ++ show intAsText Just i -> return i recvFilePath :: MonadThrow m => Consumer ByteString m FilePath recvFilePath = do -- Get the byte count fpLen <- recvInt -- Read in the given number of bytes, decode as UTF-8 text, and -- then fold all of the chunks into a single Text value fpText <- takeCE fpLen =$= decodeUtf8C =$= foldC -- Unpack the text value into a FilePath return $ unpack fpText recvFile :: MonadResource m => FilePath -- ^ directory to store files in -> Sink ByteString m () recvFile root = do -- Get the relative path to store in fpRel <- recvFilePath -- Prepend with the root directory to get a complete path let fp = root </> fpRel -- Get the size of the file fileLen <- recvInt if fileLen == (-1) -- We use -1 to indicate the file should be removed. Go ahead -- and call removeFile, but ignore any IO exceptions that -- occur when doing so (in case the file doesn't exist -- locally, for example) then liftIO $ void $ tryIO $ removeFile fp else do -- Create the containing directory liftIO $ createDirectoryIfMissing True $ takeDirectory fp -- Stream out the specified number of bytes and write them -- into the file takeCE fileLen =$= sinkFile fp main :: IO () main = hspec $ do prop "sendInt/recvInt are inverses" $ \i -> do res <- sendInt i $$ recvInt res `shouldBe` i prop "sendFilePath/recvFilePath are inverses" $ \fp -> do res <- sendFilePath fp $$ recvFilePath res `shouldBe` fp -- A more standard unit test, checking that sending and receiving -- a file does what is expected. it "create and delete files" $ -- Get temporary source and destination directories withSystemTempDirectory "src" $ \srcDir -> withSystemTempDirectory "dst" $ \dstDir -> do let relPath = "somepath.txt" content = "This is the content of the file" :: ByteString -- Ensure that sending a file that exists makes it appear in -- the destination writeFile (srcDir </> relPath) content runResourceT $ sendFile srcDir relPath $$ recvFile dstDir content' <- readFile (dstDir </> relPath) content' `shouldBe` content -- Ensure that sending a file that doesn't exist makes it -- disappear in the destination removeFile (srcDir </> relPath) runResourceT $ sendFile srcDir relPath $$ recvFile dstDir exists <- doesFileExist (dstDir </> relPath) exists `shouldBe` False

Our sendFile function looks very similar to our sourceFileDef example at the beginning of the post. But instead of streaming a default value, we just send a length of -1, as our protocol dictates. The recvFile function relies heavily on recvFilePath and recvInt. In the case of a -1, it removes the file in question. Otherwise, it creates the containing directory if necessary, and then composes takeCE with sinkFile to stream the correct number of bytes into the file.

We also have a unit test covering the interaction of these two new functions. While some kind of property could perhaps be devised for testing this with QuickCheck, a more standard unit test seemed far more straightforward in this case.

Next time on Practical Haskell

This part of the tutorial covered quite a number of topics, so this is a good place to take a break. Next time, we'll dive into the network communication aspect of things, including:

  • Implementing a really simple HTTP client
  • Implementing an echo server
  • Using some basic concurrency in Haskell to have a client and server in the same process

If you have feedback on how to make this tutorial series more useful, please share it in the comments below, on Twitter (@snoyberg), or in any Reddit discussions about it. I'll try to review feedback and make changes to parts 2 and 3.

Categories: Offsite Blogs

Functional Jobs: Software Engineer/Researcher at Galois Inc (Full-time)

Planet Haskell - Tue, 09/13/2016 - 4:43pm

We are currently seeking software engineers/researchers to play a pivotal role in fulfilling our mission to make critical systems trustworthy.

Galois engineers participate in one or more projects concurrently, and specific roles vary greatly according to skills, interests, and company needs. Your role may include technology research and development, requirements gathering, implementation, testing, formal verification, infrastructure development, project leadership, and/or supporting new business development.

Skills & Requirements

Education – Minimum of a Bachelor’s degree in computer science or equivalent. MS or PhD in CS or a related field desirable but optional, depending on specific role.

Required Technical Expertise – Must have hands-on experience developing software and/or performing computer science research. Demonstrated expertise in aspects of software development mentioned above.

Desired Technical Expertise – Fluency in the use of formal or semi-formal methods such as Haskell or other functional programming languages. Direct experience in developing high assurance systems and/or security products. Experience with identity management, security risk analysis, systems software, or networking.

Required General Skills – Must work well with customers, including building rapport, identifying needs, and communicating with strong written, verbal, and presentation skills. Must be highly motivated and able to self-manage to deadlines and quality goals.

Our engineers develop in programming languages including functional languages, designing and developing advanced technologies for safety- and security-critical systems, networks, and applications. Engineers work in small team settings and must successfully interact with clients, partners, and other employees in a highly cooperative, collaborative, and intellectually challenging environment.

We’re looking for people who can invent, learn, think, and inspire. We reward creativity and thrive on collaboration.

Get information on how to apply for this position.

Categories: Offsite Blogs

Edward Z. Yang: Seize the Means of Production (of APIs)

Planet Haskell - Mon, 09/12/2016 - 11:09pm

There's a shitty API and it's ruining your day. What do you do?

Without making a moral judgment, I want to remark that there is something very different about these two approaches. In Dropbox's case, Dropbox has no (direct) influence on what APIs Apple provides for its operating system. So it has no choice but to work within the boundaries of the existing API. (When Apple says jump, you say, "How high?") But in Adam's case, POSIX is implemented by an open source project Linux, and with some good ideas, Adam could implement his new interface on top of Linux (avoiding the necessity of writing an operating system from scratch.)

APIs cross social boundaries: there is the proletariat that produces software using an API and the bourgeoisie that controls the API. When the Man(TM) is a big corporation, our only choices are to work around their shitty API or pay them sufficiently large amounts of money to fix their shitty APIs. But when the Man(TM) is an open source project, your choices change. True, you could work around their shitty API. Or you could seize the means of production, thereby enabling you to fix the shitty API.

What do I mean by seize the means of production? Indeed, what are the means of production? An open source API does not live in a vacuum; it is made useful by the software that provides the API, the developers who contribute their time and expertise to maintain this expertise, even the publicity platform which convinces people to use an API. To seize the means of production is to gain control of all of these aspects. If you can convince the establishment that you are a core contributor of a piece of open source software, you in turn gain the ability to fix the shitty APIs. If you are unwilling or unable to do so, you might still fork, vendor or rewrite the project, but this is not so much seizing the means of production as it is recreating it from scratch. Another possibility is to build the abstraction you need on top of the existing APIs (as Adam did), though you are always at risk of the original project not being sensitive to your needs.

Time and time again, I see people working with open source projects who refuse to seize the means of production. Instead, they are willing to write increasingly convoluted workarounds to solve problems, all to stay within the boundaries. You might say, "This is just Getting Things Done(TM)", but at some point you will have done more work working around the problem, than you would have spent just fixing the damn thing.

So stop it. Stop putting up with shitty APIs. Stop working within the boundaries. Seize the means of production.

Counterpoint.

  1. What is advocated for in this post is nothing less than infinite yak shaving; if you take the advice seriously you will proceed to never get anything done ever again.
  2. It may be true that in aggregate, the cost of working around a bad API exceeds the cost to fix it, but for any individual the cost is generally less. Thus, even if you could perfectly predict the cost of a workaround versus a proper fix, individual incentives would prevent the proper fix.
  3. Users (including developers) don't know anything about the software they use and are ill-equipped to design better APIs, even if they know where it hurts.
  4. Rarely can you unilaterally seize the means of production. In an ideal world, to become a core contributor, it would merely be sufficient to demonstrate sustained, useful contributions to a project. We all know the real world is more messy.
Categories: Offsite Blogs

Sandy Maguire: Better Data Types a la Carte

Planet Haskell - Mon, 09/12/2016 - 6:00pm
<article> <header> Better Data Types a la Carte </header>

<time>September 13, 2016</time> haskell, rpg, dsl, data types a la carte, semantics

To be honest with you, my approach to procedurally generating RPG stories has been stymied until very recently. Recall the command functor:

data StoryF a = Change Character ChangeType (ChangeResult -> a) | forall x y. Interrupt (Free StoryF x) (Free StoryF y) (y -> a) | -- whatever else

This recursively defined Interrupt command has caused more than its fare share of grief. The idea is that it should represent one potential line of action being interrupted by another. The semantics are rather hazy, but this should result in grafting the Free StoryF y monad somewhere inside of the Free StoryF x monad. Once we’ve done whatever analysis on the original story, we can then forget that the Interrupt bit was ever there in the first place.

In effect, we want this:

data StoryF' a = Change Character ChangeType (ChangeResult -> a) | -- whatever else runInterrupt :: StoryF a -> StoryF' a runInterrupt = -- ???

where runInterrupt’s job is to remove any instances of the Interrupt command from our story – replacing them with the “canonical” version of what actually happened.

Of course, we could just remove all of the Interrupt data constructors from our Free StoryF a object, and rely on convention to keep track of that for us. However, if you’re like me, whenever the phrase “by convention” is used, big alarm bells should go off in your head. Convention isn’t enforced by the compiler, and so anything maintained “by convention” is highly suspect to bugs.

What would make our lives better is if we could define StoryF and StoryF' somehow in terms of one another, so that there’s no hassle to keep them in sync with one another. Even better, in the future, maybe we’ll want to remove or add other constructors as we interpret our story.

What we really want to be able to do is to mix and match individual constructors into one larger data structure, which we can then transform as we see fit.

Fortunately for us, the machinery for this has already been built. It’s Swierstra’s Data Types a la Carte (henceforth DTalC) – essentially a set of combinators capable of composing data types together, and tools for working with them in a manageable way.

Unfortunately for us, Data Types a la Carte isn’t as type-safe as we’d like it to be. Additionally, it’s missing (though not fundamentally) the primitives necessary to remove constructors.1

This post presents a variation of DTalC which is type-safe, and contains the missing machinery.

But first, we’ll discuss DTalC as it is described in the original paper, in order to get a feeling for the approach and where the problems might lie. If you know how DTalC works already, consider skipping to the next heading.

Data Types a la Carte

Data Types a la Carte presents a novel strategy for building data types out of other data types with kind2 * -> *. A code snippet is worth a thousand words, so let’s dive right in. Our StoryF command functor as described above would instead be represented like this:

data ChangeF a = Change Character ChangeType (ChangeResult -> a) data InterruptF a = forall x y. Interrupt (Free StoryF x) (Free StoryF y) (y -> a) type StoryF = ChangeF :+: InterruptF

Here, (:+:) is the type operator which composes data types together into a sum type (there is a corresponding (:*:) for products, but we’re less interested in it today.)

Because the kindedness of (:+:) lines up with that of the data types it combines, we can nest (:+:) arbitrarily deep:

type Something = Maybe :+: Either Int :+: (,) Bool :+: []

In this silly example, Something a might be any of the following:

  • Maybe a
  • Either Int a
  • (Bool, a)
  • [a]

but we can’t be sure which. We will arbitrary decide that (:+:) is right-associative – although it doesn’t matter in principle (sums are monoidal), part of our implementation will depend on this fact.

Given a moment, if you’re familiar with Haskell, you can probably figure out what the machinery must look like:

data (f :+: g) a = InL (f a) | InR (g a) deriving Functor infixr 8 :+:

(:+:) essentially builds a tree of data types, and then you use some combination of InL and InR to find the right part of the tree to use.

However, in practice, this becomes annoyingly painful and tedious; adding new data types can completely shuffle around your internal tree structure, and unless you’re careful, things that used to compile will no longer.

But fear not! Swierstra has got us covered!

class (Functor sub, Functor sup) => sub :<: sup where inj :: sub a -> sup a

This class (and its instances) say that f :<: fs means that the data type f is nestled somewhere inside of the big sum type fs. Furthermore, it gives us a witness to this fact, inj, which lifts our small data type into our larger one. With some clever instances of this typeclass, inj will expand to exactly the right combination of InL and InRs.

These instances are:

instance Functor f => f :<: f where inj = id instance (Functor f, Functor g) => f :<: (f :+: g) where inj = InL instance {-# OVERLAPPABLE #-} (Functor f, Functor g, Functor h, f :<: g) => f :<: (h :+: g) where inj = InR . inj

The first one states “if there’s nowhere left to go, we’re here!”. The second: “if our desired functor is on the left, use InL”. The third is: “otherwise, slap a InR down and keep looking”.

And so, we can now write our smart constructors in the style of:

change :: (ChangeF :<: fs) => Character -> ChangeType -> Free fs ChangeResult change c ct = liftF . inj $ Change c ct id

which will create a Change constructor in any data type which supports it (witnessed by ChangeF :<: fs).

Astute readers will notice immediately that the structural induction carried out by (:<:) won’t actually find the desired functor in any sum tree which isn’t right-associative, since it only ever recurses right. This unfortunate fact means that we must be very careful when defining DTalC in terms of type aliases.

In other words: we must adhere to a strict convention in order to ensure our induction will work correctly.

Better Data Types a la Carte

The problem, of course, is caused by the fact that DTalC can be constructed in ways that the structural induction can’t handle. Let’s fix that by constraining how DTalCs are constructed.

At the same time, we’ll add the missing inverse of inj, namely outj :: (f :<: fs) => fs a -> Maybe (f a)3, which we’ll need later to remove constructors, but isn’t fundamentally restricted in Swiestra’s method.

On the surface, our structural induction problem seems to be that we can only find data types in right-associative trees. But since right-associative trees are isomorphic to lists, the real flaw is that we’re not just using lists in the first place.

With the help of {-# LANGUAGE DataKinds #-}, we can lift lists (among other term-level things) to the type level. Additionally, using {-# LANGUAGE TypeFamilies #-}, we’re able to write type-level functions – functions which operate on and return types!

We define a type class with an associated data family:

class Summable (fs :: [* -> *]) where data Summed fs :: * -> *

Here fs is a type, as is Summed fs. Take notice, however, of the explicit kind annotations: fs is a list of things that look like Functors, and Summed fs looks like one itself.

Even with all of our fancy language extensions, a type class is still just a type class. We need to provide instances of it for it to become useful. The obvious case is if fs is the empty list:

instance Summable '[] where data Summed '[] a = SummedNil Void deriving Functor

The funny apostrophe in '[] indicates that what we’re talking about is an empty type-level list, rather than the type-constructor for lists. The distinction is at the kind level: '[] :: [k] for all kinds k, but [] :: * -> *.

What should happen if we try to join zero data types together? This is obviously crazy, but since we need to define it to be something we make it wrap Void. Since Void doesn’t have any inhabitants at the term-level, it is unconstructible, and thus so too is SummedNil.

But what use case could an unconstructible type possibly have? By itself, nothing, but notice that Either a Void must be Right a, since the Left branch can never be constructed. Now consider that Either a (Either b Void) is isomorphic to Either a b, but has the nice property that its innermost data constructor is always Left (finding the a is Left, and finding b is Right . Left).

Let’s move to the other case for our Summable class – when fs isn’t empty:

instance Summable (f ': fs) where data Summed (f ': fs) a = Here (f a) | Elsewhere (Summed fs a)

Summed for a non-empty list is either Here with the head of the list, or Elsewhere with the tail of the list. For annoying reasons, we need to specify that Summed (f ': fs) is a Functor in a rather obtuse way:

instance Summable (f ': fs) where data Summed (f ': fs) a = Functor f => Here (f a) | Elsewhere (Summed fs a) {-# LANGUAGE StandaloneDeriving #-} deriving instance Functor (Summed fs) => Functor (Summed (f ': fs))

but this now gives us what we want. Summed fs builds a nested sum-type from a type-level list of data types, and enforces (crucially, not by convention) that they form a right-associative list. We now turn our attention to building the inj machinery a la Data Types a la Carte:

class Injectable (f :: * -> *) (fs :: [* -> *]) where inj :: f a -> Summed fs a

We need to write instances for Injectable. Note that there is no instance Injectable '[] fs, since Summable '[] is unconstructible.

instance Functor f => Injectable f (f ': fs) where inj = Here instance {-# OVERLAPPABLE #-} Injectable f fs => Injectable f (g ': fs) where inj = Elsewhere . inj

These instances turn out to be very inspired by the original DTalC. This should come as no surprise, since the problem was with our construction of (:+:) – which we have now fixed – rather than our induction on (:<:).

At this point, we could define an alias between f :<: fs and Injectable f fs, and call it a day with guaranteed correct-by-construction data types a la carte, but we’re not quite done yet.

Remember, the original reason we dived into all of this mumbo jumbo was in order to remove data constructors from our DTalCs. We can’t do that yet, so we’ll need to set out on our own.

We want a function outj :: Summed fs a -> Maybe (f a) which acts as a prism into our a la carte sum types. If our Summed fs a is constructed by a f a, we should get back a Just – otherwise Nothing. We define the following type class:

class Outjectable (f :: * -> *) (fs :: [* -> *]) where outj :: Summed fs a -> Maybe (f a)

with instances that again strongly resemble DTalC:

instance Outjectable f (f ': fs) where outj (Here fa) = Just fa outj (Elsewhere _) = Nothing instance {-# OVERLAPPABLE #-} Outjectable f fs => Outjectable f (g ': fs) where outj (Here _ ) = Nothing outj (Elsewhere fa) = outj fa

The first instance says, “if what I’m looking for is the head of the list, return that.” The other says, “otherwise, recurse on an Elsewhere, or stop on a Here.”

And all that’s left is to package all of these typeclasses into something more easily pushed around:

class ( Summable fs , Injectable f fs , Outjectable f fs , Functor (Summed fs) ) => (f :: * -> *) :<: (fs :: [* -> *]) instance ( Summable fs , Injectable f fs , Outjectable f fs , Functor (Summed fs) ) => (f :<: fs)

This is a trick I learned from Edward Kmett’s great talk on Monad Homomorphisms, in which you build a class that has all of the right constraints, and then list the same constraints for an instance of it. Adding the new class as a constraint automatically brings all of its dependent constraints into scope; f :<: fs thus implies Summable fs, Injectable f fs, Outjectable f fs, and Functor (Summed fs) in a much more terse manner.

As a good measure, I wrote a test that outj is a left-inverse of inj:

injOutj_prop :: forall fs f a. (f :<: fs) => Proxy fs -> f a -> Bool injOutj_prop _ fa = isJust $ (outj (inj fa :: Summed fs a) :: Maybe (f a)) {-# LANGUAGE TypeApplications #-} main = quickCheck (injOutj_prop (Proxy @'[ [] , Proxy , Maybe , (,) Int ]) :: Maybe Int -> Bool)

where we use the Proxy fs to drive type checking for the otherwise hidden fs from the type signature in our property.

And there you have it! Data types a la carte which are guaranteed correct-by-construction, which we can automatically get into and out of. In the next post we’ll look at how rewriting our command functor in terms of DTalC solves all of our Interrupt-related headaches.

A working version of all this code together can be found on my GitHub repository.

  1. EDIT 2016-09-14: After re-reading the paper, it turns out that it describes (though doesn’t implement) this functionality.

  2. For the uninitiated, kinds are to types as types are to values – a kind is the “type” of a type. For example, Functor has kind * -> * because it doesn’t become a real type until you apply a type to it (Functor Int is a type, but Functor isn’t).

  3. I couldn’t resist the fantastic naming opportunity.

</article>
Categories: Offsite Blogs

Ketil Malde: HP EliteBook Folio G1

Planet Haskell - Mon, 09/12/2016 - 4:00am

I got a new laptop the other day. My aging Thinkpad X220 was beginning to wear out, keys were falling off and the case was starting to crack here and there. Sometimes the system log would contain ominous messages about temperatures and such. So it was finally replaced with a new laptop, a HP EliteBook Folio G1.

EliteBook Folio G1 as presented by HP.

This is a Mac-a-like thin little thing, very shiny and sleek compared to the Thinkpad's rough and rugged exterior. They are both "business models", where the Thinkpad has a look that clearly means business, the HP has more the "executive" feel, and there is probably a matching tie pin and cuff links somewhere near the bottom of the very elaborate cardboard package.

I took care to order the model with the non-touch, full-HD display, and not the one with a 4K touch display. I don't really see the point of a touch display on a PC, and I'd rather have the lighter weight, less reflective screen, and longer battery life. And it's only 12 inches or so, how small do pixels need to get, really? Full HD is something like 180 dpi, I'm pretty sure it suffices.

Did I mention shiny? I was surprised, and somewhat disappointed to unpack it and discover that the display is indeed very reflective. It's nice and bright (especially compared to the totally mediocre display on the old Lenovo), but you still need to dress like Steve Jobs (i.e. black sweater) to see anything else than your own reflection. My suspicions were further aroused when I discovered the screen actually was a touch screen, after all. I then weighed the thing, and true enough, it is ten percent heavier than it should be.

After going back and checking the model number (which was correct) and the description at the seller (which was clearly wrong), it was made clear that - although they had given the wrong specifications, a return was not possible. So that's that, I guess, I'm stuck with a heavier laptop with a less usable screen than I thought I would get. So be warned, model has a reflective touch display, even though specs don't say so.

The rest of the hardware is okay, for the most part. It's still fairly light at just a bit over one kg, and apart from its two USB-C ports, the only other connector is an audio jack. The chicklet keyboard is...not very good, compared to the Lenovo's, but I guess I'll learn to accept it. At least it's better than Apple's very shallow version. Oh, and I really like the fanless design. It's not a quiet PC - it's a silent one, and the only moving parts are the keyboard keys.

Apple may have more fanatic users, but they too don't much care for the Mac built-in keyboard.

Installing Linux

As an old-time Linux user, of course I wanted to install my favorite distribution. My previous computer had Debian installed, but I thought I'd try the latest Ubuntu for a (very small) change. Since I am particular in what software I use, and have little use for complex "desktop environments", my usual modus operandi is to install a minimal system -- meaning a Ubuntu server install -- and then pull whatever software I need.

So I copied the server install image to a USB stick (it took me ages of meddling with specialized and buggy USB stick creator software before I realized you can simply dump an ISO directly with something like ), and it happily booted, installed, and rebooted. Only to discover that networking is not supported. Not the built-in wifi, nor the USB-brick's ethernet. Really, Ubuntu?

Microsoft Windows tries to "repair" my Linux install, but gave up after a long period of trying.

Disappointed, I decided to try Debian, which provides a 'netinst' image. Unfortunately, the computer's UEFI copy protection fascist control skynet terminator system kicked in, and Debian images are non grata. So booting into the BIOS, meddle around with UEFI settings and "secure boot", to get this to work. For some reason, I needed to do this several times before it would take, and although I'm pretty sure the BIOS initially identified my USB stick by manufacturer and model, it now gave some generic description. It also wouldn't boot it directly, but there was a "boot from file" option I could use. But of course, installation failed here as well, network configuration didn't work.

Back to Ubuntu, the desktop image this time. And although I still needed to "boot from file" and dig around for an EFI image (or some such), it now managed to configure the network properly, and the install went pretty smoothly from there. I am now running Ubuntu 16.04, and wondering why the server image doesn't come with the same drivers as the desktop image. As long as the network is up, everything else can be fixed - and unless you are installing from a stack of DVDs, something I vaguely remember doing back in a previous century, without networking, you can't get anywhere.

Software

Not being a fan of desktop environments, I prefer to run the Xmonad window manager. After -ting it, I can select it at login, and things work the way they should.

That is: almost.

Since I don't have a graphical tool to inspect or configure networking, I previously used for this. Running this in the background (as root, of course), it reads a config file where all my networks are configured. When it connects and authenticates to a network, I can then run to obtain IP and things like DNS server.

My initial attempt at this config failed, and I tried instead. That didn't work too well last time I tried, which I think is the primary reason I used . The command connects with the NetworkManager daemon, and allows me to connect to networks with something like:

nmcli dev wifi connect my_ssid password my_psk_key

NetworkManager stores this information in , not unlike the config file, and after adding things once, it switches automatically - and unlike , also between other network interfaces. And it uses to proxy DNS queries, also in a smart manner (as far as I can tell).

A new install is also a good time to revisit configurations. I had a setup where XMonad's terminal command would bring up an

uxterm +sb -font '-*-fixed-medium-r-*-*-14-*-*-*-*-*-*-*'

but that font is way too small, and doesn't look very good. After some googling, I ended up with XTerm and specifically:

xterm -fn 7x13 -fa 'Liberation Mono:size=12:antialias=false' +rv

which I think looks pretty good. (Not sure why I have the antialias option, changing it to 'true' doesn't appear to change things, and fonts look to be antialiased regardless)

Docking

I got a docking station as well. Unlike the old Lenovo, where you press the whole PC down in a specialized dock, the new one is just a smallish brick with a bunch of connectors. It hooks up to the PC with a single USB-C cable, which supplies both power, and...well, USB. Running shows some devices, including a virtual disk, an ethernet interface, and something called a DisplayLink.

Bus 004 Device 003: ID 17e9:4354 DisplayLink

Ethernet and USB appears to work out of the box, but the external monitor doesn't show up, and 'xrandr' only displays the built-in display. Googling "displaylink" quickly brings one to http://www.displaylink.com/ which also have downloadable Linux drivers for Ubuntu. Distributed as a self-extracting shell script, haven't seen one of those for a while. First I needed to , to install support for building kernel modules, and after that, the display link driver made the external monitor available to xrandr configuration - and rescaled my screen and cloned it to the external display. So far so good.

There is a downside, however. The external display introduces a noticeable lag, and the DisplayLink daemon often hogs the CPU quite a bit. I also think it incurs a cost to graphically intensive programs, but so far I only noticed that chromium appears more CPU-greedy, so I could be wrong. There's a bunch of "WARNING" dumps in syslog, in addition to messages like this:

[107073.458870] evdi: [W] collapse_dirty_rects:102 Not enough space for clip rects! Rects will be collapsed

so I find it a bit hard to entirely trust this system. But at least it works.

Battery

Battery was supposed to be reasonable. Much of this depends on the CPU being able to enter sleep states. The obvious thing to do is to use 'powertop', go to the rightmost tab, and enable all kinds of hardware and device power saving. But userspace is also important, and I find that my web browser - typically with tens of tabs open on sites peppered with javascript and flash and whatnot - tend to keep the system very busy. Chromium consists of a giant cluster of threads, but as I start it from a terminal, I just bring it to the foreground and hit ctrl-Z.

Having done that, I recheck powertop, and apparently the wifi interface and the bluetooth keep interrupting the system. I don't think the function button for disabling wireless works, but running brings the power use down from 6W (about five hours) to 5.8W (five and a half)1. on the DisplayLink process brought it down to 4.4 (over seven hours), but then bounced back to 5.4 (6 hours). Dimming the backlight gets it down to 4.2W - but, okay, a dark screen and a PC doing nothing - how useful is that? In the end, what matters is how long a flight I can take and still expect to be able to use my computer, and only experience will tell.

And one nice thing about USB-C is that with a universal standard, chances are someone nearby will have a compatible charger - much like for mobile phones these days. And another nice thing is that it is already possible to buy external battery packs with USB-C that will keep your computer juiced up even for really long haul flights.

In the end?

Some things are still not working (e.g. special keys to adjust display brightness or switch off wireless - strangely, the key to dim keyboard backlight works), and other things are funky or unpredictable (e.g. that you must use the "fn" key to use function keys F1 to F12, or the direction two-finger scroll works). But by and large, things work pretty well, really.

  1. And reinserting the iwlwifi module, the device pops back into existence, and after a few seconds, NetworkManager has me reconnected. Nice!

Categories: Offsite Blogs

Michael Snoyman: Monads are like Lannisters

Planet Haskell - Sun, 09/11/2016 - 6:00pm

As many people are likely aware, monads (incorrectly) have a bad rap in the programming community for being difficult to learn. A string of extremely flawed monad tutorials based on analogies eventually led to a blog post by Brent Yorgey about the flaws of this analogy-based approach. And we've seen great learning materials on Haskell and monads.

However, I'm disappointed to see the analogy-based route disappear. Based on a recent Twitter poll I ran, which is held to the highest levels of scientific and statistical scrutiny, it's obvious that there is very high demand for a rigorous analogy-based monad tutorial. I claim that the flaw in all previous analogy based tutorials is lack of strong pop culture references. Therefore, I'm happy to announce the definitive guide to monads: Monads are like Lannisters.

Spoiler alert: if you haven't completed book 5 or season 6 of Haskell yet, there will be spoilers.

Prereqs

The examples below will all be Haskell scripts that can be run with the Stack build tool. Please grab Stack to play along. Copy-paste the full example into a file like foo.hs and then run it with stack foo.hs. (This uses Stack's script interpreter support.)

Hear me roar

Many people believe that the Lannister house words are "A Lannister always pays his debts" (or her debts of course). We'll get to that line in a moment. This belief however is false: the true Lannister house words are Hear me roar. So let's hear a Lannister roar:

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc roar :: IO () roar = putStrLn "Roar" main :: IO () main = do roar -- Tywin roar -- Cersei roar -- Jaime roar -- Tyrion

Roaring is clearly an output, and therefore it makes sense that our action is an IO action. But roaring doesn't really do much besides making sound, so its return is the empty value (). In our main function, we use do notation to roar multiple times. But we can just as easily use the replicateM_ function to replicate a monadic action multiple times and discard (that's what the _ means) the results:

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc import Control.Monad (replicateM_) roar :: IO () roar = putStrLn "Roar" main :: IO () main = replicateM_ 4 roarTyrion, the scholar

As we all know, Tyrion is a prolific scholar, consuming essentially any book he can get his hands on (we'll discuss some other consumption next). Fortunately, monads are there to back him up, with the Reader monad. Let's say that Tyrion is doing some late night research on wine production (epic foreshadowment) in various kingdoms, and wants to produce a total:

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc import Data.Map (Map) import qualified Data.Map as Map import Control.Monad.Trans.Reader import Data.Maybe (fromMaybe) type Kingdom = String type WineCount = Int type WineData = Map Kingdom WineCount tyrionResearch :: Reader WineData Int tyrionResearch = do mnorth <- asks $ Map.lookup "north" mriverlands <- asks $ Map.lookup "riverlands" return $ fromMaybe 0 mnorth + fromMaybe 0 mriverlands main :: IO () main = print $ runReader tyrionResearch $ Map.fromList [ ("north", 5) , ("riverlands", 10) , ("reach", 2000) , ("dorne", 1000) ]

While Tyrion may have chosen inferior kingdoms for wine production, it does not take away from the fact that the Reader type has allowed him access to data without having to explicitly pass it around. For an example this small (no dwarf joke intended), the payoff isn't great. But Reader is one of the simplest monads, and can be quite useful for larger applications.

Tyrion, the drinker

Let's try another one. We all know that Tyrion also likes to drink wine, not just count it. So let's use our Reader monad to give him access to a bottle of wine.

However, unlike reading data from a book, drinking from a bottle actually changes the bottle. So we have to leave our pure Reader world and get into the ReaderT IO world instead. Also known as monad transformers. (Unfortunately I don't have time now for a full post on it, but consider: Transformers: Monads in Disguise.)

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc import Control.Monad (replicateM_) import Control.Monad.Trans.Reader import Control.Monad.IO.Class import Control.Concurrent.Chan data Wine = Wine type Bottle = Chan Wine drink :: MonadIO m => Bottle -> m () drink bottle = liftIO $ do Wine <- readChan bottle putStrLn "Now I'm slightly drunker" tyrionDrinks :: ReaderT Bottle IO () tyrionDrinks = replicateM_ 10 $ ReaderT drink main :: IO () main = do -- Get a nice new bottle bottle <- newChan -- Fill up the bottle replicateM_ 20 $ writeChan bottle Wine -- CHUG! runReaderT tyrionDrinks bottleA Lannister always pays his debts

What is a debt, but receiving something from another and then returning it? Fortunately, there's a monad for that too: the State monad. It lets us take in some value, and then give it back - perhaps slightly modified.

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc import Control.Monad.Trans.State type Sword = String forge :: State Sword Sword forge = do "Ice" <- get -- do our Valyrian magic, and... -- Repay our debt to Brienne put "Oathkeeper" -- And Tywin gets a sword too! return "Widows Wail" killNedStark :: IO Sword killNedStark = do putStrLn "Off with his head!" return "Ice" main :: IO () main = do origSword <- killNedStark let (forTywin, forBrienne) = runState forge origSword putStrLn $ "Tywin received: " ++ forTywin putStrLn $ "Jaime gave Brienne: " ++ forBrienne

Not exactly justice, but the types have been satisfied! A monad always pays its debts.

Throwing

Monads are also useful for dealing with exceptional cases:

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc import Control.Exception import Data.Typeable (Typeable) data JaimeException = ThrowBranFromWindow deriving (Show, Typeable) instance Exception JaimeException jaime :: IO () jaime = do putStrLn "Did anyone see us?" answer <- getLine if answer == "no" then putStrLn "Good" else throwIO ThrowBranFromWindow main :: IO () main = jaimeKilling

And thanks to asynchronous exceptions, you can also kill other threads with monads.

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc import Control.Concurrent import Control.Exception import Control.Monad import Data.Typeable (Typeable) data JaimeException = Defenestrate deriving (Show, Typeable) instance Exception JaimeException bran :: IO () bran = handle onErr $ forever $ do putStrLn "I'm climbing a wall!" threadDelay 100000 where onErr :: SomeException -> IO () onErr ex = putStrLn $ "Oh no! I've been killed by: " ++ show ex jaime :: ThreadId -> IO () jaime thread = do threadDelay 500000 putStrLn "Oh, he saw us" throwTo thread Defenestrate threadDelay 300000 putStrLn "Problem solved" main :: IO () main = do thread <- forkIO bran jaime thread

Exercise for the reader: modify bran so that he properly recovers from that exception and begins warging instead. (WARNING: don't recover from async exceptions in practice.)

Exiting

You can also exit your entire process with monads.

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc import System.Exit tommen :: IO () tommen = do putStrLn "Oh, my dear wife!" exitFailure main :: IO () main = tommenPassing the baton of reign

We've seen Lannisters pass the baton of reign from family member to family member. We've also seem them ruthlessly destroy holy insitutions and have their private, internal affairs exposed. As it turns out, we can do all of that with some explicit state token manipulation!

#!/usr/bin/env stack -- stack --resolver lts-6.15 --install-ghc runghc {-# LANGUAGE MagicHash #-} {-# LANGUAGE UnboxedTuples #-} import GHC.Prim import GHC.Types joffrey :: IO () joffrey = do putStrLn "Look at your dead father's head Sansa!" putStrLn "Oh no, poisoned!" tommen :: IO () tommen = do putStrLn "I'm in love!" putStrLn "*Swan dive*" cersei :: IO () cersei = undefined -- season 7 isn't out yet unIO :: IO a -> State# RealWorld -> (# State# RealWorld, a #) unIO (IO f) = f main :: IO () main = IO $ \s0 -> case unIO joffrey s0 of (# s1, () #) -> case unIO tommen s1 of (# s2, () #) -> unIO cersei s2Honorable mentions

There's much more to be done with this topic, and there were other ideas besides Lannisters for this post. To give some mention for other ideas:

  • There's plenty of other Lannister material to work with (skinning animals, Jaime and Cersei affairs, or Tyrion proclivities). But I had to draw the line somewhere (both for length of post and topics I felt like discussing...). Feel free to comment about other ideas.
  • One of my favorites: monads are like onions
  • Monads are just your opinion, man
  • "they're like Gargoyles, they decorate the public facing interface of important landmarks in Haskell-land." link

Personally, I was going to go with a really terrible "monads are like printers" based on them taking plain paper in and spitting out printed pages, but Lannisters was a lot more fun. Also, I'm sure there's something great to be done with presidential candidates, at the very least throwTo opponent Insult.

Disclaimer

Since I'm sure someone is going to get upset about this: YES, this post is meant entirely as parody, and should not be used for actually learning anything except some random details of Game of Thrones, and perhaps a bit of Haskell syntax. Please use the learning materials I linked at the beginning for actually learning about Haskell and monads. And my real advice: don't actually "learn monads," just start using Haskell and you'll pick them up naturally.

Categories: Offsite Blogs

Philip Wadler: Option A vs B: Kicked into the long grass

Planet Haskell - Sun, 09/11/2016 - 1:50pm


I've been putting off posting about the final outcome of the West-East Cycle Route confrontation at Edinburgh City Council's Transport and Environment committee, in part because there was no final outcome.

I sat through four hours of Edinburgh Council's Transport and Environment Committee. The end result was a fudge. The Council heard adamant depositions from both sides, and decided not to decide. They will form a stakeholder group, with representatives from all involved, and attempt to come to a compromise that satisfies all sides. But it's hard to see how that can happen: if we don't get a straight route it will be a disaster, but if we do folk in Roseburn will fear an impact on their businesses (studies show cycle routes don't harm business, but they seem to be taking the Gove line: they are tired of hearing from experts).

Though perhaps we were lucky to get a fudge. Had it gone to a straight vote, following the whip Greens and Labour would have voted for Option A (7 votes) while the SDP, Conservatives, and Liberal Democrats would have voted for Option B (8 votes). As it was, the Greens gave a rousing defense of Option A, with fantastic speeches from Nigel Bagshaw and Chas Booth. Everyone else supported the 'compromise', and took the opportunity to excoriate the Greens for taking a principled stand. Even those who would have voted for Option B expressed support for cycling, and perhaps that is something we can build on.

Dave duFeu of Spokes posted an excellent summary, including links to the webcast of the meeting and a list of the steps you can take to support the West-East Cycle Route.

The Edinburgh Evening News offered coverage in the two days before: Cyclists take to street to support £6m cycle “superhighway” and D-Day for cycleway in Edinburgh amidst anger and division. The first of these features a couple of photos of me arguing with opponents of the cycle route in Roseburn. I don't look calm and collected. Oddly, I can't find any coverage the Evening News gave to the outcome of the Transport and Environment committee meeting. Was there any?

Not only are Roseburn residents up in arms. A similar cycleway in Bearsden, near Glasgow, is attracting comparable ire from its locals. What is our best way forward to fight bicycle bigots?

Previously:   Option A: Think about the children
  Roseburn to Leith Walk A vs B: time to act!
  Ride the Route in support of Option A
Categories: Offsite Blogs

Philip Wadler: What is it like to understand advanced mathematics?

Planet Haskell - Sun, 09/11/2016 - 11:31am

A correspondent on Quora explains the insider's view to an outsider. Some selections:

  • You can answer many seemingly difficult questions quickly. But you are not very impressed by what can look like magic, because you know the trick. The trick is that your brain can quickly decide if a question is answerable by one of a few powerful general purpose "machines"  (e.g., continuity arguments, the correspondences between geometric and algebraic objects, linear algebra, ways to reduce the infinite to the finite through various forms of compactness) combined with specific facts you have learned about your area. The number of fundamental ideas and techniques that people use to solve problems is, perhaps surprisingly, pretty small -- seehttp://www.tricki.org/tri<wbr></wbr>cki/map for a partial list, maintained by Timothy Gowers.
  • You go up in abstraction, "higher and higher". The main object of study yesterday becomes just an example or a tiny part of what you are considering today. For example, in calculus classes you think about functions or curves. In functional analysis or algebraic geometry, you think of spaces whose pointsare functions or curves -- that is, you "zoom out" so that every function is just a point in a space, surrounded by many other "nearby" functions. Using this kind of zooming out technique, you can say very complex things in short sentences -- things that, if unpacked and said at the zoomed-in level, would take up pages. Abstracting and compressing in this way makes it possible to consider extremely complicated issues with one's limited memory and processing power.
  • You are easily annoyed by imprecision in talking about the quantitative or logical. This is mostly because you are trained to quickly think about counterexamples that make an imprecise claim seem obviously false.
Categories: Offsite Blogs

Philip Wadler: ISDS enables corruption

Planet Haskell - Sun, 09/11/2016 - 11:22am

I've long known that ISDS (Investor-State Dispute Settlement) is one of the worst aspects of TTIP, TTP, and CETA. But I've thought the main problem with ISDS was first, that it enabled businessmen to sue governments over laws enacted to save their citizenry (as in the cartoon below), and, second, its secrecy. What I did not understand was how it enables corruption. Kudos to Buzzfeed for their detailed investigation.
Say a nation tries to prosecute a corrupt CEO or ban dangerous pollution. Imagine that a company could turn to this super court and sue the whole country for daring to interfere with its profits, demanding hundreds of millions or even billions of dollars as retribution.

Imagine that this court is so powerful that nations often must heed its rulings as if they came from their own supreme courts, with no meaningful way to appeal. That it operates unconstrained by precedent or any significant public oversight, often keeping its proceedings and sometimes even its decisions secret. That the people who decide its cases are largely elite Western corporate attorneys who have a vested interest in expanding the court’s authority because they profit from it directly, arguing cases one day and then sitting in judgment another. That some of them half-jokingly refer to themselves as “The Club” or “The Mafia.”

And imagine that the penalties this court has imposed have been so crushing — and its decisions so unpredictable — that some nations dare not risk a trial, responding to the mere threat of a lawsuit by offering vast concessions, such as rolling back their own laws or even wiping away the punishments of convicted criminals. This system is already in place, operating behind closed doors in office buildings and conference rooms in cities around the world. Known as investor-state dispute settlement, or ISDS, it is written into a vast network of treaties that govern international trade and investment, including NAFTA and the Trans-Pacific Partnership, which Congress must soon decide whether to ratify.

These trade pacts have become a flashpoint in the US presidential campaign. But an 18-month BuzzFeed News investigation, spanning three continents and involving more than 200 interviews and tens of thousands of documents, many of them previously confidential, has exposed an obscure but immensely consequential feature of these trade treaties, the secret operations of these tribunals, and the ways that business has co-opted them to bring sovereign nations to heel.

The series starts today with perhaps the least known and most jarring revelation: Companies and executives accused or even convicted of crimes have escaped punishment by turning to this special forum. Based on exclusive reporting from the Middle East, Central America, and Asia, BuzzFeed News has found the following:
  • A Dubai real estate mogul and former business partner of Donald Trump was sentenced to prison for collaborating on a deal that would swindle the Egyptian people out of millions of dollars — but then he turned to ISDS and got his prison sentence wiped away.
  • In El Salvador, a court found that a factory had poisoned a village — including dozens of children — with lead, failing for years to take government-ordered steps to prevent the toxic metal from seeping out. But the factory owners’ lawyers used ISDS to help the company dodge a criminal conviction and the responsibility for cleaning up the area and providing needed medical care.
  • Two financiers convicted of embezzling more than $300 million from an Indonesian bank used an ISDS finding to fend off Interpol, shield their assets, and effectively nullify their punishment.

Categories: Offsite Blogs

LHC Team: Haskell Suite: Type inference.

Planet Haskell - Sat, 09/10/2016 - 3:39am
Disclaimer: I am not an academic. The following post is akin to a plumber's guide to brain surgery.

Type inference is complex and has evolved over time. In this post, I will try to explain how I see the landscape and where LHC and other Haskell compilers fit into this landscape.

The beginning: The Hindley–Milner type system. 1982.
The typing rules here are quite simple and every Haskeller seem to learn them intuitively. They include things like: if 'f :: A → B' and 'a :: A' then 'f a :: B'.
In this system, types are always inferred and there must always be a single, most general type for every expression. This becomes a problem when we want higher ranked types because here a single, most general type cannot be inferred. There may be many equally valid type solutions and it has to be up to the programmer to select the appropriate one. But this cannot happen in plain HM as type signatures are only used to make inferred types less general (eg. [a] was inferred but the programmer wanted [Int]).
Omitting the type signature in the following code can show us what plain HM would be like:
<script src="https://gist.github.com/Lemmih/3d9e64eb6b6dd4735e823e95094555e9.js"></script> In GHC, the snippet will run fine with the type signature but not without it.


Version two: Bidirectional type system. 2000.
People realised that checking the correctness of a given type is much easier than inferring a correct type. Armed with this knowledge, a new type checker was born that had two modes usually called 'up' and 'down'. The 'up' mode lifts a new correct type up from an expression and the 'down' mode that checks the correctness of a type. Because of these two modes, this kind of system was called bidirectional and it deals with higher ranked types quite well.
LHC current implements this.

Version three: Boxy types. 2006.
At this point it had become apparent that higher ranked types didn't really play well with higher order functions. People often found themselves in situations where slight, seemingly innocent changes caused the type-checker to reject their programs. An example of this can be seen in this gist:
<script src="https://gist.github.com/Lemmih/52d35e65b915a722bc377165109715ab.js"></script>Impredicative polymorphism is required for the above code and boxy types is a stab in that direction. Bidirectional type checking was a big improvement over plain HM but it lacked granularity. Types are either 100% inferred or 100% checked with no middle ground. What if you wanted to check parts of a type and infer the rest? Well, boxy types solves exactly that problem. Boxes are added (internally, we're not making changes to Haskell here) to types and they signify an unknown that should be inferred. Now parts of types can be checked while the boxes are inferred and we're left with the best of both worlds. This is what JHC implements, btw. Boxy types was also implemented in GHC but was deemed to be too complicated.

Version four: FPH, First-class Polymorphism for Haskell. 2008.
Impredicative polymorphism, second attempt from the authors of boxy types. Improvements were made but the problem is still not solved.

Version five: OutsideIn(X). 2011.
GHC is a hotbed for experimentation in type checkers. GADTs, multi-parameter type classes, type families. These are just some of the features that makes the type-checker the largest and most complicated component of GHC. To deal with all of this, researchers came up with OutsideIn, described in a paper longer than all the previous papers put together. The algorithm is relatively simple, but, for practical reasons, implementations must reject some programs that are valid according to the specification.

Categories: Offsite Blogs