News aggregator

FP Complete: Static compilation with Stack

Planet Haskell - Thu, 10/06/2016 - 8:00pm

In our last blog post we showed you the new docker init executable pid1. What if we wanted to use our shiny new pid1 binary on a CentOS Docker image, but we compiled it on Ubuntu? The answer is that it likely wouldn't work: every Linux flavor packages things up a little differently, with different versions and flags.

If we were to compile pid1 completely static it could be portable (within a given range of Linux kernel versions). Let's explore different ways to compile a GHC executable with Stack. Maybe we can come up with a way to create portable binaries.

Base Image for Experiments

First let's create a base image since we are going to be trying many different compilation scenarios.

Here's a Dockerfile for Alpine Linux & GHC 8.0 with Stack.

# USE ALPINE LINUX
FROM alpine
RUN apk update

# INSTALL BASIC DEV TOOLS, GHC, GMP & ZLIB
RUN echo "https://s3-us-west-2.amazonaws.com/alpine-ghc/8.0" >> /etc/apk/repositories
ADD https://raw.githubusercontent.com/mitchty/alpine-ghc/master/mitch.tishmack%40gmail.com-55881c97.rsa.pub \
    /etc/apk/keys/mitch.tishmack@gmail.com-55881c97.rsa.pub
RUN apk update
RUN apk add alpine-sdk git ca-certificates ghc gmp-dev zlib-dev

# GRAB A RECENT BINARY OF STACK
ADD https://s3.amazonaws.com/static-stack/stack-1.1.2-x86_64 /usr/local/bin/stack
RUN chmod 755 /usr/local/bin/stack

Let's build it and give it a tag.

docker build --no-cache=true --tag fpco/pid1:0.1.0-base .

Default GHC Compilation

Next let's compile pid1 with default Stack & GHC settings.

Here's our minimalist stack.yaml file.

resolver: lts-7.1

Here's our project Dockerfile that extends our test base image above.

FROM fpco/pid1:0.1.0-base

# COMPILE PID1
ADD ./ /usr/src/pid1
WORKDIR /usr/src/pid1
RUN stack --local-bin-path /sbin install --test

# SHOW INFORMATION ABOUT PID1
RUN ldd /sbin/pid1 || true
RUN du -hs /sbin/pid1

Let's compile this default configuration using Docker and give it a label.

docker build --no-cache=true --tag fpco/pid1:0.1.0-default .

A snippet from the Docker build showing the results.

Step 6 : RUN ldd /sbin/pid1 || true
 ---> Running in fcc138c199d0
        /lib/ld-musl-x86_64.so.1 (0x559fe5aaf000)
        libgmp.so.10 => /usr/lib/libgmp.so.10 (0x7faff710b000)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x559fe5aaf000)
 ---> 70836a2538e2
Removing intermediate container fcc138c199d0
Step 7 : RUN du -hs /sbin/pid1
 ---> Running in 699876efeb1b
956.0K  /sbin/pid1

You can see that this build results in a semi-static binary with dynamic links to musl (libc) and GMP. This is not very portable: we always have to worry about the dynamic linkage happening at run time, and this binary would probably not run on Ubuntu as is.

100% Static

Let's try compiling our binary as a 100% static Linux ELF binary, with no links to any dynamic libraries. Note that our open source license must be compatible with musl and GMP in order to do this.

Let's try a first run with static linkage. Here's another Dockerfile that shows a new ghc-option to link statically.

FROM fpco/pid1:0.1.0-base

# TRY TO COMPILE
ADD ./ /usr/src/pid1
WORKDIR /usr/src/pid1
RUN stack --local-bin-path /sbin install --test --ghc-options '-optl-static'

Let's give it a go.

docker build --no-cache=true --tag fpco/pid1:0.1.0-static .

Oh no. It didn't work. Looks like there's some problem with linking. :|

[1 of 1] Compiling System.Process.PID1 ( src/System/Process/PID1.hs, .stack-work/dist/x86_64-linux/Cabal-1.24.0.0/build/System/Process/PID1.o )
/usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: /usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/crtbeginT.o: relocation R_X86_64_32 against `__TMC_END__' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/crtbeginT.o: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
`gcc' failed in phase `Linker'. (Exit code: 1)

--  While building package pid1-0.1.0 using:
      /root/.stack/setup-exe-cache/x86_64-linux/setup-Simple-Cabal-1.24.0.0-ghc-8.0.1 --builddir=.stack-work/dist/x86_64-linux/Cabal-1.24.0.0 build lib:pid1 exe:pid1 --ghc-options " -ddump-hi -ddump-to-file"
    Process exited with code: ExitFailure 1

PIC flag

OK, that last error said we should recompile with -fPIC. Let's try that. Once again, here's a Dockerfile with the static linkage flag plus the new -fPIC flag.

FROM fpco/pid1:0.1.0-base

# TRY TO COMPILE
ADD ./ /usr/src/pid1
WORKDIR /usr/src/pid1
RUN stack --local-bin-path /sbin install --test --ghc-options '-optl-static -fPIC'

Let's give it a try.

docker build --no-cache=true --tag fpco/pid1:0.1.0-static-fpic .

But we get the same error again.

[1 of 1] Compiling System.Process.PID1 ( src/System/Process/PID1.hs, .stack-work/dist/x86_64-linux/Cabal-1.24.0.0/build/System/Process/PID1.o )
/usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: /usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/crtbeginT.o: relocation R_X86_64_32 against `__TMC_END__' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/crtbeginT.o: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
`gcc' failed in phase `Linker'. (Exit code: 1)

--  While building package pid1-0.1.0 using:
      /root/.stack/setup-exe-cache/x86_64-linux/setup-Simple-Cabal-1.24.0.0-ghc-8.0.1 --builddir=.stack-work/dist/x86_64-linux/Cabal-1.24.0.0 build lib:pid1 exe:pid1 --ghc-options " -ddump-hi -ddump-to-file"
    Process exited with code: ExitFailure 1

crtbeginT swap

Searching around for this crtbeginT linkage problem, we find a known workaround: swap in crtbeginS.o for crtbeginT.o and the link succeeds. Here's the Dockerfile with the hack.

FROM fpco/pid1:0.1.0-base

# FIX https://bugs.launchpad.net/ubuntu/+source/gcc-4.4/+bug/640734
WORKDIR /usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/
RUN cp crtbeginT.o crtbeginT.o.orig
RUN cp crtbeginS.o crtbeginT.o

# COMPILE PID1
ADD ./ /usr/src/pid1
WORKDIR /usr/src/pid1
RUN stack --local-bin-path /sbin install --test --ghc-options '-optl-static -fPIC'

# SHOW INFORMATION ABOUT PID1
RUN ldd /sbin/pid1 || true
RUN du -hs /sbin/pid1

When we try it again:

docker build --no-cache=true --tag fpco/pid1:0.1.0-static-fpic-crtbegint .

It works this time!

Step 8 : RUN ldd /sbin/pid1 || true
 ---> Running in 8b3c737c2a8d
ldd: /sbin/pid1: Not a valid dynamic program
 ---> 899f06885c71
Removing intermediate container 8b3c737c2a8d
Step 9 : RUN du -hs /sbin/pid1
 ---> Running in d641697cb2a8
1.1M    /sbin/pid1
 ---> aa17945f5bc4

Nice. 1.1M isn't too bad for a portable binary. Let's see if we can make it smaller, though. On larger executables, especially those linking other external libraries, this static output can reach 50MB(!)

Optimal Size

GCC Optimization

The GCC manpage says that -Os optimizes for size. Let's try it.

Specify -optc-Os to optimize for size.

FROM fpco/pid1:0.1.0-base

# FIX https://bugs.launchpad.net/ubuntu/+source/gcc-4.4/+bug/640734
WORKDIR /usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/
RUN cp crtbeginT.o crtbeginT.o.orig
RUN cp crtbeginS.o crtbeginT.o

# COMPILE PID1
ADD ./ /usr/src/pid1
WORKDIR /usr/src/pid1
RUN stack --local-bin-path /sbin install --test --ghc-options '-optl-static -fPIC -optc-Os'

# SHOW INFORMATION ABOUT PID1
RUN ldd /sbin/pid1 || true
RUN du -hs /sbin/pid1

docker build --no-cache=true --tag fpco/pid1:0.1.0-static-fpic-crtbegint-optcos .

Step 9 : RUN ldd /sbin/pid1 || true
 ---> Running in 8e28314924d0
ldd: /sbin/pid1: Not a valid dynamic program
 ---> c977f078eb24
Removing intermediate container 8e28314924d0
Step 10 : RUN du -hs /sbin/pid1
 ---> Running in 4e6b5c4d87aa
1.1M    /sbin/pid1
 ---> 66d459e3fcc1

There isn't any difference in output size with this flag. You may want to try it on a larger or more complex executable to see if it makes a difference for you.

Split Objects

GHC allows us to "split objects" when we compile Haskell code. That means each Haskell module is broken up into its own native library. In this scenario, when we import a module, our final executable is linked against the smaller split objects instead of the entire package. This helps reduce the size of the executable. The trade-off is that it takes more time for GHC to compile.

resolver: lts-7.1
build: { split-objs: true }

docker build --no-cache=true --tag fpco/pid1:0.1.0-static-fpic-crtbegint-optcos-split .

Step 9 : RUN ldd /sbin/pid1 || true
 ---> Running in 8e28314924d0
ldd: /sbin/pid1: Not a valid dynamic program
 ---> c977f078eb24
Removing intermediate container 8e28314924d0
Step 10 : RUN du -hs /sbin/pid1
 ---> Running in 4e6b5c4d87aa
1.1M    /sbin/pid1
 ---> 66d459e3fcc1

There isn't any difference in output size with this flag in this case. On some executables this really makes a big difference. Try it yourself.

UPX Compression

Let's try compressing our static executable with UPX. Here's a Dockerfile.

FROM fpco/pid1:0.1.0-base

# FIX https://bugs.launchpad.net/ubuntu/+source/gcc-4.4/+bug/640734
WORKDIR /usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/
RUN cp crtbeginT.o crtbeginT.o.orig
RUN cp crtbeginS.o crtbeginT.o

# COMPILE PID1
ADD ./ /usr/src/pid1
WORKDIR /usr/src/pid1
RUN stack --local-bin-path /sbin install --test --ghc-options '-optl-static -fPIC -optc-Os'

# COMPRESS WITH UPX
ADD https://github.com/lalyos/docker-upx/releases/download/v3.91/upx /usr/local/bin/upx
RUN chmod 755 /usr/local/bin/upx
RUN upx --best --ultra-brute /sbin/pid1

# SHOW INFORMATION ABOUT PID1
RUN ldd /sbin/pid1 || true
RUN du -hs /sbin/pid1

Build an image that includes UPX compression.

docker build --no-cache=true --tag fpco/pid1:0.1.0-static-fpic-crtbegint-optcos-split-upx .

And, wow, that's some magic.

Step 11 : RUN ldd /sbin/pid1 || true
 ---> Running in 69f86bd03d01
ldd: /sbin/pid1: Not a valid dynamic program
 ---> c01d54dca5ac
Removing intermediate container 69f86bd03d01
Step 12 : RUN du -hs /sbin/pid1
 ---> Running in 01bbed565de0
364.0K  /sbin/pid1
 ---> b94c11bafd95

This makes a huge difference, with the resulting executable about a third of the original size. There is a small price to pay to decompress the executable when it starts, but for a pid1 that runs for the lifetime of the container, this is not noticeable.

Slackware Support

Here's a Slackware example running the pid1 binary that was compiled on Alpine Linux:

FROM vbatts/slackware
ADD https://s3.amazonaws.com/download.fpcomplete.com/pid1/pid1-0.1.0-amd64 /sbin/pid1
RUN chmod 755 /sbin/pid1
ENTRYPOINT [ "/sbin/pid1" ]
CMD bash -c 'while(true); do sleep 1; echo alive; done'

Build the image and run it.

docker build -t fpco/pid1:0.1.0-example-slackware .
docker run --rm -i -t fpco/pid1:0.1.0-example-slackware

It works!

alive
alive
alive
^C
Categories: Offsite Blogs

Brent Yorgey: ICFP roundup

Planet Haskell - Tue, 10/04/2016 - 9:36pm

ICFP 2016 in Nara, Japan was a blast. Here are a few of my recollections.

The Place

Although I was a coauthor on an ICFP paper in 2011, when it was in Tokyo, I did not go since my son was born the same week. So this was my first time in Japan, or anywhere in Asia, for that matter. (Of course, this time I missed my son’s fifth birthday…)

I’ve been to Europe multiple times, and although it is definitely foreign, the culture is similar enough that I feel like I basically know how to behave. I did not feel that way in Japan. I’m pretty sure I was constantly being offensive without realizing it, but most of the time people were polite and accommodating.

…EXCEPT for that one time I was sitting in a chair chatting with folks during a break between sessions, with my feet up on a (low, plain) table, and an old Japanese guy WHACKED his walking stick on the table and shouted angrily at me in Japanese. That sure got my adrenaline going. Apparently putting your feet on the table is a big no-no, lesson learned.

The food was amazing even though I didn’t know what half of it was. I was grateful that I (a) am not vegetarian, (b) know how to use chopsticks decently well, and (c) am an adventurous eater. If any one of those were otherwise, things might have been more difficult!

On my last day in Japan I had the whole morning before I needed to head to the airport, so Ryan Yates and I wandered around Nara and saw a bunch of temples, climbed the hill, and such. It’s a stunningly beautiful place with a rich history.

The People

As usual, it’s all about the people. I enjoyed meeting some new people, including (but not limited to):

  • Pablo Buiras and Marco Vassena were my hotel breakfast buddies; it was fun getting to know them a bit.
  • I finally met Dominic Orchard, though I feel like I’ve known his name and known about some of his work for a while.
  • I don’t think I had met Max New before but we had a nice chat about the Scheme enumerations library he helped develop and combinatorial species. I hope to be able to follow up that line of inquiry.
  • As promised, I met everyone who commented on my blog post, including Jürgen Peters (unfortunately we did not get a chance to play go), Andrey Mokhov (who nerd-sniped me with a cool semiring-ish thing with some extra structure — perhaps that will be another blog post), and Jay McCarthy (whom I had actually met before, but we had some nice chats, including one in the airport while waiting for our flight to LAX).
  • I don’t think I had met José Manuel Calderón Trilla before; we had a great conversation over a meal together (along with Ryan Yates) in the Osaka airport while waiting for our flights.
  • I met Diogenes Nunez, who went to my alma mater Williams College. When I taught at Williams a couple years ago I’m pretty sure I heard Diogenes mentioned by the other faculty, so it was fun to get to meet him.
  • Last but certainly not least, I met my coauthor, Piyush Kurur. We collaborated on a paper through the magic of the Internet (Github in particular), and I actually met him in person for the first time just hours before he presented our paper!

My student Ollie Kwizera came for PLMW—it was fun having him there. I only crossed paths with him three or four times, but I think that was all for the best, since he made his own friends and had his own experiences.

Other people who I enjoyed seeing and remember having interesting conversations with include (but I am probably forgetting someone!) Michael Adams, Daniel Bergey, Jan Bracker, Joachim Breitner, David Christiansen, David Darais, Stephen Dolan, Richard Eisenberg, Kenny Foner, Marco Gaboardi, Jeremy Gibbons, John Hughes, David Janin, Neel Krishnaswami, Dan Licata, Andres Löh, Simon Marlow, Tom Murphy, Peter-Michael Osera, Jennifer Paykin, Simon Peyton Jones, Ryan Scott, Mary Sheeran, Mike Sperber, Luite Stegeman, Wouter Swierstra, David Terei, Ryan Trinkle, Tarmo Uustalu, Stephanie Weirich, Nick Wu, Edward Yang, and Ryan Yates. My apologies if I forgot you, just remind me and I’ll add you to the list! I’m amazed and grateful I get to know all these cool people.

The Content

Here are just a few of my favorite talks:

  • I’m a sucker for anything involving geometry and/or random testing and/or pretty pictures, and Ilya Sergey’s talk Growing and Shrinking Polygons for Random Testing of Computational Geometry had them all. In my experience, doing effective random testing in any domain beyond basic functions usually requires some interesting domain-specific insights, and Ilya had some cool insights about ways to generate and shrink polygons in ways that were much more likely to generate small counterexamples for computational geometry algorithms.

  • Idris gets more impressive by the day, and I always enjoy David Christiansen’s talks.

  • Sandra Dylus gave a fun talk, All Sorts of Permutations, with the cute observation that a sorting algorithm equipped with a nondeterministic comparison operator generates permutations (though it goes deeper than that). During the question period someone asked whether there is a way to generate all partitions, and someone sitting next to me suggested using the group function—and indeed, I think this works. I wonder what other sorts of combinatorial objects can be enumerated by this method. In particular I wonder if quicksort with nondeterministic comparisons can be adapted to generate not just all permutations, but all binary trees.

  • I greatly enjoyed TyDe, especially Jeremy Gibbons’ talk on APLicative Programming with Naperian Functors (I don’t think the video is online yet, if there is one). I’ll be serving as co-chair of the TyDe program committee next year, so start thinking about what you would like to submit!

  • There were also some fun talks at FARM, for example, Jay McCarthy’s talk on Bithoven. But I don’t think the FARM videos are uploaded yet. Speaking of FARM, the performance evening was incredible. It will be hard to live up to next year.


Categories: Offsite Blogs

FP Complete: Docker demons: PID-1, orphans, zombies, and signals

Planet Haskell - Tue, 10/04/2016 - 8:00pm

There are a number of corner cases to consider when dealing with Docker, multiple processes, and signals. Probably the most famous post on this matter is from the Phusion blog. Here, we'll see some examples of how to see these problems first hand, and one way to work around it: fpco/pid1.

The Phusion blog post recommends using their baseimage-docker. This image provides a my_init entrypoint which handles the problems described here, as well as introducing some extra OS features, such as syslog handling. Unfortunately, we ran into problems with Phusion's usage of syslog-ng, in particular with it creating unkillable processes pegged at 100% CPU usage. We're still investigating the root cause, but in practice we have found that the syslog usage is a far less motivating case than simply a good init process, which is why we've created the pid1 Haskell package together with a simple fpco/pid1 Docker image.

This blog post is intended to be interactive: you'll get the most bang for your buck by opening up your terminal and running commands along with reading the text. It will be far more motivating to see your Ctrl-C completely fail to kill a process.

NOTE The primary reason we wrote our own implementation in Haskell was to be able to embed it within the Stack build tool. There are other lightweight init processes already available, such as dumb-init. I've also blogged about using dumb-init. While this post uses pid1, there's nothing specific to it versus other init processes.

Playing with entrypoints

Docker has a concept of entrypoints, which provides a default wrapping command for commands you provide to docker run. For example, consider this interaction with Docker:

$ docker run --entrypoint /usr/bin/env ubuntu:16.04 FOO=BAR bash -c 'echo $FOO'
BAR

This works because the above is equivalent to:

$ docker run ubuntu:16.04 /usr/bin/env FOO=BAR bash -c 'echo $FOO'

Entrypoints can be overridden on the command line (as we just did), but can also be specified in the Dockerfile (which we'll do later). The default entrypoint for the ubuntu Docker image is a null entrypoint, meaning that the provided command will be run directly without any wrapping. We're going to simulate that experience by using /usr/bin/env as an entrypoint, since switching entrypoint back to null isn't yet supported in released Docker. When you run /usr/bin/env foo bar baz, the env process will exec the foo command, making foo the new PID 1, which for our purposes gives it the same behavior as a null entrypoint.

Both the fpco/pid1 and snoyberg/docker-testing images we'll use below set /sbin/pid1 as the default entrypoint. In the example commands, we're explicitly including --entrypoint /sbin/pid1. This is just to be clear on which entrypoint is being used; if you exclude that option, the same behavior will persist.

Sending TERM signal to process

We'll start with our sigterm.hs program, which runs ps (we'll see why soon), then sends itself a SIGTERM, and then loops forever. On a Unix system, the default behavior when a process receives SIGTERM is to exit, so we'd expect our process to simply exit when run.
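The full source of sigterm.hs isn't reproduced in this post, but a minimal sketch of such a program might look like the following (a hypothetical reconstruction using the process and unix packages; the real code may differ):

import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import System.Posix.Process (getProcessID)
import System.Posix.Signals (sigTERM, signalProcess)
import System.Process (callProcess)

main :: IO ()
main = do
  callProcess "ps" []        -- print the process table (we'll see why soon)
  pid <- getProcessID
  signalProcess sigTERM pid  -- send ourselves a SIGTERM
  forever $ do               -- if we survive, announce it forever
    putStrLn "Still alive!"
    threadDelay 1000000

Let's see what actually happens: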

$ docker run --rm --entrypoint /usr/bin/env snoyberg/docker-testing sigterm
  PID TTY          TIME CMD
    1 ?        00:00:00 sigterm
    9 ?        00:00:00 ps
Still alive!
Still alive!
Still alive!
^C
$

The process ignored the SIGTERM and kept running, until I hit Ctrl-C (we'll see what that does later). Another feature of the sigterm code base, though, is that if you give it the command-line argument install-handler, it will explicitly install a SIGTERM handler which kills the process.
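Installing such a handler takes only a few lines with the unix package. Here's a hedged sketch of what install-handler presumably does (a guess at the idea rather than the actual source; the 143 = 128 + 15 exit code is an assumption):

import System.Exit (ExitCode (ExitFailure))
import System.Posix.Process (exitImmediately)
import System.Posix.Signals (Handler (Catch), installHandler, sigTERM)

-- Make SIGTERM terminate the process, even when we are PID 1.
installTermHandler :: IO ()
installTermHandler = do
  _ <- installHandler sigTERM (Catch (exitImmediately (ExitFailure 143))) Nothing
  return ()

Perhaps surprisingly, installing the handler has a significant impact on our application: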

$ docker run --rm --entrypoint /usr/bin/env snoyberg/docker-testing sigterm install-handler
  PID TTY          TIME CMD
    1 ?        00:00:00 sigterm
    8 ?        00:00:00 ps
Still alive!
$

The reason for this is some Linux kernel magic: the kernel treats a process with PID 1 specially, and does not, by default, kill the process when receiving the SIGTERM or SIGINT signals. This can be very surprising behavior. For a simpler example, try running the following commands in two different terminals:

$ docker run --rm --name sleeper ubuntu:16.04 sleep 100
$ docker kill -s TERM sleeper

Notice how the docker run command does not exit, and if you check your ps aux output, you'll see that the process is still running. That's because the sleep process was not designed to be PID 1, and does not install a special signal handler. To work around this problem, you've got two choices:

  1. Ensure every command you run from docker run has explicit handling of SIGTERM.
  2. Make sure the command you run isn't PID 1, but instead use a process that is designed to handle SIGTERM correctly.

Let's see how the sigterm program works with our /sbin/pid1 entrypoint:

$ docker run --rm --entrypoint /sbin/pid1 snoyberg/docker-testing sigterm
  PID TTY          TIME CMD
    1 ?        00:00:00 pid1
    8 ?        00:00:00 sigterm
   12 ?        00:00:00 ps

The program exits immediately, as we'd like. But look at the ps output: our first process is now pid1 instead of sigterm. Since sigterm is being launched as a different PID (8 in this case), the special casing from the Linux kernel does not come into play, and default SIGTERM handling is active. To step through exactly what happens in our case:

  1. Our container is created, and the command /sbin/pid1 sigterm is run inside of it.
  2. pid1 starts as PID-1, does its business, and then fork/execs the sigterm executable.
  3. sigterm raises the SIGTERM signal to itself, causing it to die.
  4. pid1 sees that its child died from SIGTERM (== signal 15) and exits with exit code 143 (== 128 + 15).
  5. Since our PID1 is dead, our container dies too.

This isn't just some magic with sigterm, you can do the same thing with sleep:

$ docker run --rm --name sleeper fpco/pid1 sleep 100
$ docker kill -s TERM sleeper

Unlike with the ubuntu image, this will kill the container immediately, due to the /sbin/pid1 entrypoint used by fpco/pid1.

NOTE In the case of sigterm, which sends the TERM signal to itself, it turns out you don't need a special PID1 process with signal handling, anything will do. For example, try docker run --rm --entrypoint /usr/bin/env snoyberg/docker-testing /bin/bash -c "sigterm;echo bye". But playing with sleep will demonstrate the need for a real signal-aware PID1 process.

Ctrl-C: sigterm vs sleep

There's a slight difference between sigterm and sleep when it comes to the behavior of hitting Ctrl-C. When you use Ctrl-C, it sends a SIGINT to the docker run process, which proxies that signal to the process inside the container. sleep will ignore it, just as it ignores SIGTERM, due to the default signal handling for PID 1 in the Linux kernel. However, the sigterm executable is written in Haskell, and the Haskell runtime itself installs a signal handler that converts SIGINT into a user interrupt exception, overriding the PID-1 default behavior. For more on signal proxying, see the docker attach documentation.
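To see the runtime's behavior concretely, here is a small hedged sketch (not from the post) of catching the UserInterrupt exception that the Haskell runtime raises when the process receives SIGINT:

import Control.Concurrent (threadDelay)
import Control.Exception (AsyncException (UserInterrupt), catch, throwIO)
import Control.Monad (forever)

main :: IO ()
main = loop `catch` handler
  where
    loop = forever $ do
      putStrLn "Still alive!"
      threadDelay 1000000
    handler :: AsyncException -> IO ()
    handler UserInterrupt = putStrLn "Caught SIGINT as UserInterrupt, exiting"
    handler e             = throwIO e

Run as PID 1, a program like this still exits on Ctrl-C, even though the kernel's default PID-1 signal handling would otherwise ignore the SIGINT.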

Reaping orphans

Suppose you have process A, which fork/execs process B. When process B dies, process A must call waitpid to get its exit status from the kernel, and until it does so, process B will be dead but with an entry in the system process table. This is known as being a zombie.

But what happens if process B outlives process A? In this case, process B is known as an orphan, and needs to be adopted by the init process, aka PID1. It is the init process's job to reap orphans so they do not remain as zombies.
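For intuition, here's roughly the kind of reaping loop an init process runs, sketched in Haskell with the unix package (an illustration only, not pid1's actual implementation):

{-# LANGUAGE ScopedTypeVariables #-}

import Control.Exception (SomeException, try)
import System.Posix.Process (getAnyProcessStatus)

-- Block until any child changes status, reaping zombies as they appear.
-- getAnyProcessStatus throws once there are no children left to wait for.
reapLoop :: IO ()
reapLoop = do
  r <- try (getAnyProcessStatus True False)
  case r of
    Left (_ :: SomeException) -> return ()  -- no more children
    Right _                   -> reapLoop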

The orphans.hs program will:

  • Spawn a child process, and then loop forever calling ps
  • In the child process: run the echo command a few times, without calling waitpid, and then exit
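A hedged sketch of what orphans.hs might look like (again a hypothetical reconstruction, not the actual source):

import Control.Concurrent (threadDelay)
import Control.Monad (forM_, forever)
import System.Posix.Process (executeFile, forkProcess)
import System.Process (callProcess)

main :: IO ()
main = do
  _ <- forkProcess child  -- process B
  forever $ do            -- process A: loop forever calling ps
    putStrLn "Still alive!"
    callProcess "ps" []
    threadDelay 1000000

-- Process B: spawn echo processes without ever calling waitpid, then exit.
child :: IO ()
child = do
  forM_ [1 :: Int .. 4] $ \i ->
    forkProcess $ executeFile "echo" True [show i] Nothing
  threadDelay 100000  -- give the echos a moment to run before exiting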

As you can see, none of the processes involved will reap the zombie echo processes. The output from the process confirms that we have, in fact, created zombies:

$ docker run --rm --entrypoint /usr/bin/env snoyberg/docker-testing orphans
1
2
3
4
Still alive!
  PID TTY          TIME CMD
    1 ?        00:00:00 orphans
    8 ?        00:00:00 orphans
   13 ?        00:00:00 echo <defunct>
   14 ?        00:00:00 echo <defunct>
   15 ?        00:00:00 echo <defunct>
   16 ?        00:00:00 echo <defunct>
   17 ?        00:00:00 ps
Still alive!
  PID TTY          TIME CMD
    1 ?        00:00:00 orphans
   13 ?        00:00:00 echo <defunct>
   14 ?        00:00:00 echo <defunct>
   15 ?        00:00:00 echo <defunct>
   16 ?        00:00:00 echo <defunct>
   18 ?        00:00:00 ps
Still alive!

And so on until we kill the container. That <defunct> indicates a zombie process. The issue is that our PID 1, orphans, doesn't do reaping. As you probably guessed, we can solve this by just using the /sbin/pid1 entrypoint:

$ docker run --rm --entrypoint /sbin/pid1 snoyberg/docker-testing orphans
1
2
3
4
Still alive!
  PID TTY          TIME CMD
    1 ?        00:00:00 pid1
   10 ?        00:00:00 orphans
   14 ?        00:00:00 orphans
   19 ?        00:00:00 echo <defunct>
   20 ?        00:00:00 echo <defunct>
   21 ?        00:00:00 echo <defunct>
   22 ?        00:00:00 echo <defunct>
   23 ?        00:00:00 ps
Still alive!
  PID TTY          TIME CMD
    1 ?        00:00:00 pid1
   10 ?        00:00:00 orphans
   24 ?        00:00:00 ps
Still alive!

pid1 now adopts the echo processes when the child orphans process dies, and reaps accordingly.

Surviving children

Let's try out something else: process A is the primary command for the Docker container, and it spawns process B. Before process B exits, process A exits, causing the Docker container to exit. In this case, the running process B will be forcibly closed by the kernel (see this Stack Overflow question for details). We can see this with our surviving.hs program.
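Here's a hedged sketch of what surviving.hs might look like (a hypothetical reconstruction; counts and timings are made up):

import Control.Concurrent (threadDelay)
import Control.Monad (forM_, forever)
import System.Posix.Process (forkProcess)
import System.Posix.Signals (Handler (Catch), installHandler, sigTERM)

main :: IO ()
main = do
  forM_ [1 :: Int .. 4] $ \i -> forkProcess (child i)
  putStrLn "Parent sleeping"
  threadDelay 2000000
  putStrLn "Parent exiting"  -- the children are still running at this point

child :: Int -> IO ()
child i = do
  _ <- installHandler sigTERM (Catch (putStrLn "Got a TERM")) Nothing
  forever $ do
    putStrLn ("Child: " ++ show i)
    threadDelay 500000

Running it with the /usr/bin/env entrypoint: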

$ docker run --rm --entrypoint /usr/bin/env snoyberg/docker-testing surviving
Parent sleeping
Child: 1
Child: 2
Child: 4
Child: 3
Child: 1
Child: 2
Child: 3
Child: 4
Parent exiting

Unfortunately this doesn't give our child processes a chance to do any cleanup. Instead, we would rather send them a SIGTERM, and after a grace period send them a SIGKILL. This is exactly what pid1 does:

$ docker run --rm --entrypoint /sbin/pid1 snoyberg/docker-testing surviving
Parent sleeping
Child: 2
Child: 3
Child: 1
Child: 4
Child: 2
Child: 1
Child: 4
Child: 3
Parent exiting
Got a TERM
Got a TERM
Got a TERM
Got a TERM

Signaling docker run vs PID1

When you run sleep 60 and then hit Ctrl-C, the sleep process itself receives a SIGINT. When you instead run docker run --rm fpco/pid1 sleep 60 and hit Ctrl-C, you may think that the same thing is happening. However, in reality, it's not at all the same. Your docker run call creates a docker run process, which sends a command to the Docker daemon on your machine, and that daemon creates the actual sleep process (inside a container). When you hit Ctrl-C on your terminal, you're sending SIGINT to docker run, which is in fact sending a command to the Docker daemon, which in turn sends a SIGINT to your sleep process.

Want proof? Try out the following:

$ docker run --rm fpco/pid1 sleep 60&
[1] 417
$ kill -KILL $!
$ docker ps
CONTAINER ID        IMAGE               COMMAND                 CREATED             STATUS              PORTS               NAMES
69fbc70e95e2        fpco/pid1           "/sbin/pid1 sleep 60"   11 seconds ago      Up 11 seconds                           hopeful_mayer
[1]+  Killed                  docker run --rm fpco/pid1 sleep 60

In this case, we sent a SIGKILL to the docker run command. Unlike SIGINT or SIGTERM, SIGKILL cannot be handled, and therefore docker run is unable to delegate signal handling to a different process. As a result, the docker run command itself dies, but the sleep process (and its container) continue running.

Some takeaways from this:

  • Make sure you use something like pid1 so that a SIGINT or SIGTERM sent to the docker run process actually gets your container to shut down reliably
  • If you must send a SIGKILL to your process, use the docker kill command instead

Alternative to entrypoint

We've used --entrypoint /sbin/pid1 a lot here. In fact, each usage of that has been superfluous, since the fpco/pid1 and snoyberg/docker-testing images both use /sbin/pid1 as their default entrypoint anyway. I included it for explicitness. To prove it to you:

$ docker run --rm fpco/pid1 sleep 60
^C
$

But if you don't want to muck with entrypoints, you can always just include /sbin/pid1 at the beginning of your command, e.g.:

$ docker run --rm --entrypoint /usr/bin/env fpco/pid1 /sbin/pid1 sleep 60
^C
$

And if you have your own Docker image and you'd just like to include the pid1 executable, you can download it from the Github releases page.

Dockerfiles, command vs exec form

You may be tempted to put something like ENTRYPOINT /sbin/pid1 in your Dockerfile. Let's see why that won't work:

$ cat Dockerfile
FROM fpco/pid1
ENTRYPOINT /sbin/pid1
$ docker build --tag test .
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM fpco/pid1
 ---> aef1f7b702b9
Step 2 : ENTRYPOINT /sbin/pid1
 ---> Using cache
 ---> f875b43a9e40
Successfully built f875b43a9e40
$ docker run --rm test ps
pid1: No arguments provided

The issue here is that we specified /sbin/pid1 in what Docker calls command form. This is just a raw string which is interpreted by the shell. It cannot be passed an additional command (like ps), and therefore pid1 itself complains that it hasn't been told what to run. The correct way to specify your entrypoint is the exec form, ENTRYPOINT ["/sbin/pid1"], e.g.:

$ cat Dockerfile FROM fpco/pid1 ENTRYPOINT ["/sbin/pid1"] $ docker build --tag test . Sending build context to Docker daemon 2.048 kB Step 1 : FROM fpco/pid1 ---> aef1f7b702b9 Step 2 : ENTRYPOINT /sbin/pid1 ---> Running in ba0fa8c5bd41 ---> 4835dec4aae6 Removing intermediate container ba0fa8c5bd41 Successfully built 4835dec4aae6 $ docker run --rm test ps PID TTY TIME CMD 1 ? 00:00:00 pid1 8 ? 00:00:00 ps

Generally speaking, you should stick with exec form in your Dockerfiles at all times. It is explicit about whitespace handling, and avoids the need to use a shell as an interpreter.

Takeaways

The main takeaway here is: unless you have a good reason to do otherwise, you should use a minimal init process like pid1. The Phusion/my_init approach works, but may be too heavyweight for some. If you don't need syslog and other add-on features of Phusion, you're probably best off with a minimal init instead.

As a separate but somewhat related comment: we're going to have a follow-up post on this blog in the coming days explaining how we compiled the pid1 executable as a static executable to make it compatible with various Linux flavors, and how you can do the same for your Haskell executables. Stay tuned!

Categories: Offsite Blogs

Douglas M. Auclair (geophf): September 2016 1HaskellADay problems and solutions

Planet Haskell - Sun, 10/02/2016 - 8:04pm
Categories: Offsite Blogs

Jasper Van der Jeugt: Patat and Myanmar travels

Planet Haskell - Sat, 10/01/2016 - 6:00pm
Presentations in the terminal

At work, I frequently need to give (internal) presentations and demos using video conferencing. I prefer to do these quick-and-dirty presentations in the terminal for a few reasons:

  • I don’t spend time worrying about layout, terminal stuff always looks cool.
  • I want to write markdown if possible.
  • You can have a good “Questions?” slide just by running cowsay 'Questions?'
  • Seamless switching between editor/shell and presentation using tmux.

The last point is important for video conferencing especially. The software we use allows you to share a single window from your desktop. This is pretty neat if you have a multi-monitor setup. However, it does not play well with switching between a PDF viewer and a terminal.

Introducing patat

To this end, I wrote patat – Presentations And The ANSI Terminal – because I was not entirely happy with the available solutions. You can get it from Hackage: cabal install patat.

patat screenshot

You run it simply by doing:

patat presentation.md

The key features are:

  • Built on Pandoc:

    The software I was using before contained some Markdown parsing bugs. By using Pandoc under the hood, this should not happen.

    Additionally, we get all the input formats Pandoc supports (Literate Haskell is of particular importance to me) and some additional elements like tables and definition lists.

  • Smart slide splitting:

    Most Markdown presentation tools seem to split slides at --- (horizontal rulers). This is a bit verbose since you usually start each slide with an h1 as well. patat will check if --- is used and if it’s not, it will split on h1s instead.

  • Live reload:

    If you run patat --watch presentation.md, patat will poll the file for changes and reload automatically. This is really handy when you are writing the presentation, I usually use it with split-pane in tmux.

An example of a presentation is:

---
title: This is my presentation
author: Jane Doe
...

# This is a slide

Slide contents. Yay.

# Important title

Things I like:

- Markdown
- Haskell
- Pandoc
- Traveling

How patat came to be

I started writing a simple prototype of patat during downtime at ICFP2016, when I discovered that MDP was not able to parse my presentation correctly.

After ICFP, I flew to Myanmar, and I am currently traveling around the country with my girlfriend. It’s a super interesting place to visit, with a rich history. Now that NLD is the ruling party, I think it is a great time to visit the country responsibly.

Riding around visiting temples in Bagan

However, it is a huge country – the largest in south-east Asia – so there is some downtime traveling on domestic flights, buses and boats. I thought it was a good idea to improve the tool a bit further, since you don’t need internet to hack on this sort of thing.

Pull requests are welcome as always! Note that I will be slow to respond: for the next three days I will be trekking from Kalaw to Inle Lake, so I have no connectivity (or electricity, for that matter).

Sunset at U Bein bridge

Sidenote: “Patat” is the Flemish word for “potato”. Dutch people also use it to refer to French Fries but I don’t really do that – in Belgium we just call fries “Frieten”.

Categories: Offsite Blogs

JP Moresmau: Everything is broken

Planet Haskell - Sat, 10/01/2016 - 7:09am
This week was, I suppose, fairly typical. Started using a new library, the excellent sqlg that provides the TinkerPop graph API on top of relational databases. Found a bug pretty quickly. Off we go to contribute to another open source project, good for my street cred I suppose. Let’s fork it, and open the source code in IDEA (Community Edition). After years of hearing abuse about Eclipse, I’m now trying to use “the best IDE ever” (say all the fanboys) instead. Well, that didn’t go so well: apparently importing a Maven project and resolving the dependencies proves too much for IDEA. I fought with it for a while, then gave up.
Fired up Eclipse, it opened and built the sqlg projects without a hitch. Wrote a test, fixed the bug, raised a PR, got it accepted with a thank you, life is good.
Then I find another bug. Except that upon investigation, it’s not in sqlg, it’s in the actual TinkerPop code. The generics on a map are wrong: there are values that are not instances of the key class (thanks, generics type erasure!). So I can fix it by changing the method signature, or by changing the keys. Both will break existing code. Sigh…
Oh, and the TinkerPop project doesn’t build in Eclipse. The Eclipse compiler chokes on some Java 8 code. Off to the Eclipse bug tracker. Maybe I need to have three different Java IDEs to be able to handle all the projects I may find bugs in.

Everything is broken. Off I go to my own code to add my own bugs.
Categories: Offsite Blogs

Well-Typed.Com: Hackage reliability via mirroring

Planet Haskell - Fri, 09/30/2016 - 9:08am
TL;DR: Hackage now has multiple secure mirrors which can be used fully automatically by clients such as cabal.

In the last several years, as a community, we’ve come to greatly rely on services like Hackage and Stackage being available 24/7. There is always enormous frustration when either of these services goes down.

I think as a community we’ve also been raising our expectations. We’re all used to services like Google which appear to be completely reliable. Of course these are developed and operated by huge teams of professionals, whereas our community services are developed, maintained and operated by comparatively tiny teams on shoestring budgets.

A path to greater reliability

Nevertheless, reliability is important to us all, and so there has been a fair bit of effort put in over the last few years to improve reliability. I’ll talk primarily about Hackage since that is what I am familiar with.

Firstly, a couple years ago Hackage and haskell.org were moved from super-cheap VM hosting (where our machines tended to go down several times a year) to actually rather good quality hosting provided by Rackspace. Thanks to Rackspace for donating that, and the haskell.org infrastructure team for getting that organised and implemented. That in itself has made a huge difference: we’ve had far fewer incidents of downtime since then.

Obviously even with good quality hosting we’re still only one step away from unscheduled downtime, because the architecture is too centralised.

There were two approaches that people proposed. One was classic mirroring: spread things out over multiple mirrors for redundancy. The other proposal was to adjust the Hackage architecture somewhat, so that while the main active Hackage server runs on some host, the core Hackage archive would be placed on an ultra-reliable 3rd party service like AWS S3, so that it would stay available even if the main server was unavailable.

The approach we decided to take was the classic mirroring one. In some ways this is the harder path, but I think ultimately it gives the best results. This approach also tied in with the new security architecture (The Update Framework – TUF) that we were implementing. The TUF design includes mirrors and works in such a way that mirrors do not need to be trusted. If we (or rather end users) do not have to trust the operators of all the mirrors then this makes a mirroring approach much more secure and much easier to deploy.

Where we are today

The new system has been in beta for some time and we’re just short of flipping the switch for end users. The new Hackage security system is in place on the server side, while on the client side the latest release of cabal-install can be configured to use it, and the development version uses it by default.
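For reference, enabling the secure repository on the client side looks roughly like this in ~/.cabal/config (a sketch based on cabal-install 1.24; the exact stanza may differ between releases):

repository hackage.haskell.org
  url: http://hackage.haskell.org/
  secure: True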

There is lots to say about the security system, but that has been (1, 2, 3) and will be covered elsewhere. This post is about mirroring.

For mirrors, we currently have two official public mirrors, and a third in the works. One mirror is operated by FP Complete and the other by Herbert Valerio Riedel. For now, Herbert and I manage the list of mirrors and we will be accepting contributions of further public mirrors. It is also possible to run private mirrors.

Once you are using a release of cabal-install that uses the new system then no further configuration is required to make use of the mirrors (or indeed the security). The list of public mirrors is published by the Hackage server (along with the security metadata) and cabal-install (and other clients using hackage-security) will automatically make use of them.

Reliability in the new system

Both of the initial mirrors are individually using rather reliable hosting. One is on AWS S3 and one on DreamHost S3. Indeed the weak point in the system is no longer the hosting. It is other factors like reliability of the hosts running the agents that do the mirroring, and the ever present possibility of human error.

The fact that the mirrors are hosted and operated independently is the key to improved reliability. We want to reduce the correlation of failures.

Failures in hosting can be mitigated by using multiple providers. Even AWS S3 goes down occasionally. Failures in the machines driving the mirroring are mitigated by using a normal decentralised pull design (rather than pushing out from the centre) and hosting the mirroring agents separately. Failures due to misconfiguration and other human errors are mitigated by having different mirrors operated independently by different people.

So all these failures can and will happen, but if they are not correlated and we have enough mirrors then the system overall can be quite reliable.

There is of course still the possibility that the upstream server goes down. It is annoying not to be able to upload new packages, but it is far more important that people be able to download packages. The mirrors mean there should be no interruption in the download service, and it gives the upstream server operators the breathing space to fix things.

Categories: Offsite Blogs

Neil Mitchell: Full-time Haskell jobs in London, at Barclays

Planet Haskell - Thu, 09/29/2016 - 6:26am
Summary: I'm hiring 9 Haskell programmers. Email neil.d.mitchell AT barclays.com to apply.

I work for Barclays, in London, working on a brand new Haskell project. We're looking for nine additional Haskell programmers to come and join the team.

What we offer

A permanent job, writing Haskell, using all the tools you know and love – GHC/Cabal/Stack etc. In my first two weeks in the role I've already written parsers with attoparsec and generators for both Haskell and HTML, and am currently generating a binding to C with lots of Storable/Ptr stuff. Since the project is new, you will have the chance to help shape it.

The project itself is to write a risk engine – something that lets the bank calculate the value of the trades it has made, and how things like changes in the market change their value. Risk engines are important to banks and include lots of varied tasks – talking to other systems, overall structuring, actual computation, streaming values, map/reduce.

We'll be following modern but lightweight development principles – including nightly builds, no check-ins to master which break the tests (enforced automatically) and good test coverage.

These positions have attractive salary levels.

What we require

We're looking for the best functional programmers out there, with a strong bias towards Haskell. We have a range of seniorities available to suit even the most experienced candidates. We don't have anything at the very junior end; instead we're looking for candidates that are already fluent and productive. That said, a number of very good Haskell programmers think of themselves as beginners even after many years, so if you're not sure, please get in touch.

We do not require any finance knowledge.

The role is in London, Canary Wharf, and physical presence in the office on a regular basis is required – permanent remote working is not an option.

How to apply

To apply, email neil.d.mitchell AT barclays.com with a copy of your CV. If you have any questions, email me.

The best way to assess technical ability is to look at code people have written. If you have any packages on Hackage or things on GitHub, please point me at the best projects. If your best code is not publicly available, please describe the Haskell projects you've been involved in.

Categories: Offsite Blogs

Don Stewart (dons): Haskell dev roles with Strats @ Standard Chartered

Planet Haskell - Thu, 09/29/2016 - 2:17am

The Strats team at Standard Chartered is growing. We have 10 more open roles currently, in a range of areas:

  • Haskell dev for hedging effectiveness analytics, and build hedging services.
  • Haskell devs for derivatives pricing services. Generic roles using Haskell.
  • Web-experienced Haskell devs for frontends to analytics services written in Haskell. PureScript and/or data viz and user-interface skills desirable
  • Haskell dev for trading algorithms and strategy development.
  • Dev/ops role to extend our continuous integration infrastructure (Haskell+git)
  • Contract analysis and manipulation in Haskell for trade formats (FpML + Haskell).
  • Haskell dev for low latency (< 100 microsecond) components in soft real-time non-linear pricing charges service.

You would join an existing team of 25 Haskell developers in Singapore or London. Generally our roles involve directly working with traders to automate their work and improve their efficiency. We use Haskell for all tasks. Either GHC Haskell or our own (“Mu”) implementation, and this is a rare chance to join a large, experienced Haskell dev team.

We offer permanent or contractor positions, at Director and Associate Director level, with very competitive compensation. Demonstrated experience in typed FP (Haskell, OCaml, F#, etc.) is required.

All roles require some physical presence in either Singapore or London, and we offer flexibility within these constraints (with work from home available). No financial background is required or assumed.

More info about our development process is in the 2012 PADL keynote, and a 2013 HaskellCast interview.

If this sounds exciting to you, please send your PDF resume to me – donald.stewart <at> sc.com


Tagged: jobs
Categories: Offsite Blogs

Well-Typed.Com: Sharing, Space Leaks, and Conduit and friends

Planet Haskell - Thu, 09/29/2016 - 12:20am
TL;DR: Sharing conduit values leads to space leaks. Make sure that conduits are completely reconstructed on every call to runConduit; this implies we have to be careful not to create any (potentially large) conduit CAFs (skip to the final section “Avoiding space leaks” for some details on how to do this). Similar considerations apply to other streaming libraries and indeed any Haskell code that uses lazy data structures to drive computation.

Motivation

We use large lazy data structures in Haskell all the time to drive our programs. For example, consider

main1 :: IO ()
main1 = forM_ [1..5] $ \_ -> mapM_ print [1 .. 1000000]

It’s quite remarkable that this works and that this program runs in constant memory. But this stands on a delicate cusp. Consider the following minor variation on the above code:

ni_mapM_ :: (a -> IO b) -> [a] -> IO ()
{-# NOINLINE ni_mapM_ #-}
ni_mapM_ = mapM_

main2 :: IO ()
main2 = forM_ [1..5] $ \_ -> ni_mapM_ print [1 .. 1000000]

This program runs, but unlike main1, it has a maximum residency of 27 MB; in other words, this program suffers from a space leak. As it turns out, main1 was running in constant memory because the optimizer was able to eliminate the list altogether (due to the fold/build rewrite rule), but it is unable to do so in main2.

But why is main2 leaking? In fact, we can recover constant space behaviour by recompiling the code with -fno-full-laziness. The full laziness transformation is effectively turning main2 into

longList :: [Integer]
longList = [1 .. 1000000]

main3 :: IO ()
main3 = forM_ [1..5] $ \_ -> ni_mapM_ print longList

The first iteration of the forM_ loop constructs the list, which is then retained to be used by the next iterations. Hence, the large list is retained for the duration of the program, which is the beforementioned space leak.

The full laziness optimization is taking away our ability to control when data structures are not shared. That ability is crucial when we have actions driven by large lazy data structures. One particularly important example of such lazy structures that drive computation are conduits or pipes. For example, consider the following conduit code:

import Control.Monad.IO.Class (liftIO)
import qualified Data.Conduit as C

countConduit :: Int -> C.Sink Char IO ()
countConduit cnt = do
    mi <- C.await
    case mi of
      Nothing -> liftIO (print cnt)
      Just _  -> countConduit $! cnt + 1

getConduit :: Int -> C.Source IO Char
getConduit 0 = return ()
getConduit n = do
    ch <- liftIO getChar
    C.yield ch
    getConduit (n - 1)

Here countConduit is a sink that counts the characters it receives from upstream, and getConduit n is a conduit that reads n characters from the console and passes them downstream.

To illustrate what might go wrong, we will use the following exception handler throughout this blog post:

retry :: IO a -> IO a
retry io = do
    ma <- try io
    case ma of
      Right a                   -> return a
      Left (_ :: SomeException) -> retry io

The important point to notice about this exception handler is that it retains a reference to the action io as it executes that action, since it might potentially have to execute it again if an exception is thrown. However, all the space leaks we discuss in this blog post arise even when an exception is never thrown and hence the action is run only once; simply maintaining a reference to the action until the end of the program is enough to cause the space leak.

If we use this exception handler as follows:

main :: IO ()
main = retry $ C.runConduit $ getConduit 1000000 C.=$= countConduit 0

we again end up with a large space leak, this time of type Pipe and ->Pipe (conduit’s internal type), as heap profiling shows.

Although the values that stream through the conduit come from IO, the conduit itself is fully constructed and retained in memory. In this blog post we examine what exactly is being retained here, and why. We will finish with some suggestions on how to avoid such space-leaks, although sadly there is no easy answer. Note that these problems are not specific to the conduit library, but apply equally to all other similar libraries.

We will not assume any knowledge of conduit but start from first principles; however, if you have never used any of these libraries before this blog post is probably not the best starting point; you might for example first want to watch my presentation Lazy I/O and Alternatives in Haskell.

Lists

Before we look at the more complicated case, let’s first consider another program using just lists:

main :: IO ()
main = retry $ ni_mapM_ print [1..1000000]

This program suffers from a space leak for similar reasons to the example with lists we saw in the introduction, but it’s worth spelling out the details here: where exactly is the list being maintained?

Recall that the IO monad is effectively a state monad over a token RealWorld state (if that doesn’t make any sense to you, you might want to read ezyang’s article Unraveling the mystery of the IO monad first). Hence, ni_mapM_ (just a wrapper around mapM_) is really a function of three arguments: the action to execute for every element of the list, the list itself, and the world token. That means that

ni_mapM_ print [1..1000000]

is a partial application, and hence we are constructing a PAP object. Such a PAP object is a runtime representation of a partial application of a function; it records the function we want to execute (ni_mapM_), as well as the arguments we have already provided. It is this PAP object that we give to retry, and which retry retains until the action completes because it might need it in the exception handler. The long list in turn is being retained because there is a reference from the PAP object to the list (as one of the arguments that we provided).

Full laziness does not make a difference in this example; whether or not the [1 .. 1000000] expression gets floated out makes no difference.

Reminder: Conduits/Pipes

Just to make sure we don’t get lost in the details, let’s define a simple conduit-like or pipe-like data structure:

data Pipe i o m r =
    Yield o (Pipe i o m r)
  | Await (Either r i -> Pipe i o m r)
  | Effect (m (Pipe i o m r))
  | Done r

A pipe or a conduit is a free monad which provides three actions:

  1. Yield a value downstream
  2. Await a value from upstream
  3. Execute an effect in the underlying monad.

The argument to Await is passed an Either; we give it a Left value if upstream terminated, or a Right value if upstream yielded a value.

This definition is not quite the same as the one used in real streaming libraries and ignores various difficulties (in particular exception safely, as well as other features such as leftovers); however, it will suffice for the sake of this blog post. We will use the terms “conduit” and “pipe” interchangeably in the remainder of this article.

Sources

The various Pipe constructors differ in their memory behaviour and the kinds of space leaks that they can create. We therefore consider them one by one. We will start with sources, because their memory behaviour is relatively straightforward.

A source is a pipe that only ever yields values downstream. For example, here is a source that yields the values [n, n-1 .. 1]:

yieldFrom :: Int -> Pipe i Int m ()
yieldFrom 0 = Done ()
yieldFrom n = Yield n $ yieldFrom (n - 1)

We could “run” such a pipe as follows:

printYields :: Show o => Pipe i o m () -> IO ()
printYields (Yield o k) = print o >> printYields k
printYields (Done ())   = return ()

If we then run the following program:

main :: IO ()
main = retry $ printYields (yieldFrom 1000000)

we get a space leak. This space leak is very similar to the space leak we discussed in section Lists above, with Done () playing the role of the empty list and Yield playing the role of (:). As in the list example, this program has a space leak independent of full laziness.

Sinks

A sink is a conduit that only ever awaits values from upstream; it never yields anything downstream. The memory behaviour of sinks is considerably more subtle than the memory behaviour of sources and we will examine it in detail. As a reminder, the constructor for Await is

data Pipe i o m r =
    Await (Either r i -> Pipe i o m r)
  | ...

As an example of a sink, consider this pipe that counts the number of characters it receives:

countChars :: Int -> Pipe Char o m Int
countChars cnt =
    Await $ \mi -> case mi of
      Left  _ -> Done cnt
      Right _ -> countChars $! cnt + 1

We could “run” such a sink by feeding it a bunch of characters; say, 1000000 of them:

feed :: Char -> Pipe Char o m Int -> IO ()
feed ch = feedFrom 1000000
  where
    feedFrom :: Int -> Pipe Char o m Int -> IO ()
    feedFrom _ (Done r)  = print r
    feedFrom 0 (Await k) = feedFrom 0     $ k (Left 0)
    feedFrom n (Await k) = feedFrom (n-1) $ k (Right ch)

If we run this as follows and compile with optimizations enabled, we once again end up with a space leak:

main :: IO ()
main = retry $ feed 'A' (countChars 0)

We can recover constant space behaviour by disabling full laziness; however, the effect of full laziness on this example is a lot more subtle than the example we described in the introduction.
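(Concretely, disabling the optimization for just the affected module can be done with a pragma; a small sketch using the flag mentioned in the introduction:)

{-# OPTIONS_GHC -fno-full-laziness #-}
module CountChars where  -- hypothetical module name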

Full laziness

Let’s take a brief moment to describe what full laziness is, exactly. Full laziness is one of the optimizations that ghc applies by default when optimizations are enabled; it is described in the paper “Let-floating: moving bindings to give faster programs”. The idea is simple; if we have something like

f = \x y ->
      let e = .. -- expensive computation involving x but not y
      in ..

full laziness floats the let binding out over the lambda to get

f = \x ->
      let e = ..
      in \y -> ..

This potentially avoids unnecessarily recomputing e for different values of y. Full laziness is a useful transformation; for example, it turns something like

f x y = ..
  where
    go = .. -- some local function

into

f x y = ..

f_go .. = ..

which avoids allocating a function closure every time f is called. It is also quite a notorious optimization, because it can create unexpected CAFs (constant applicative forms; top-level definitions of values); for example, if you write

nthPrime :: Int -> Int
nthPrime n = allPrimes !! n
  where
    allPrimes :: [Int]
    allPrimes = ..

you might expect nthPrime to recompute allPrimes every time it is invoked; but full laziness might move that allPrimes definition to the top-level, resulting in a large space leak (the full list of primes would be retained for the lifetime of the program). This goes back to the point we made in the introduction: full laziness is taking away our ability to control when values are not shared.

Full laziness versus sinks

Back to the sink example. What exactly is full laziness doing here? Is it constructing a CAF we weren’t expecting? Actually, no; it’s more subtle than that. Our definition of countChars was

countChars :: Int -> Pipe Char o m Int
countChars cnt =
    Await $ \mi -> case mi of
      Left  _ -> Done cnt
      Right _ -> countChars $! cnt + 1

Full laziness is turning this into something more akin to

countChars' :: Int -> Pipe Char o m Int
countChars' cnt =
    let k = countChars' $! cnt + 1
    in Await $ \mi -> case mi of
         Left  _ -> Done cnt
         Right _ -> k

Note how the computation of countChars' $! cnt + 1 has been floated over the lambda; ghc can do that, since this expression does not depend on mi. So in memory the countChars 0 expression from our main function (retained, if you recall, because of the surrounding retry wrapper), develops something like this. It starts off as a simple thunk:

Then when feed matches on it, it gets reduced to weak head normal form, exposing the top-most Await constructor:

The body of the await is a function closure pointing to the function inside countChars (\mi -> case mi ..), which has countChars $! (cnt + 1) as an unevaluated thunk in its environment. Evaluating it one step further yields

So where for a source the data structure in memory was a straightforward “list” consisting of Yield nodes, for a sink the situation is more subtle: we build up a chain of Await constructors, each of which points to a function closure which in its environment has a reference to the next Await constructor. This wouldn’t matter of course if the garbage collector could clean up after us; but if the conduit itself is shared, then this results in a space leak.

Without full laziness, incidentally, evaluating countChars 0 yields

and the chain stops there; the only thing in the function closure now is cnt. Since we don’t allocate the next Await constructor before running the function, we never construct a chain of Await constructors and hence we have no space leak.

Depending on values

It is tempting to think that if the conduit varies its behaviour depending on the values it receives from upstream the same chain of Await constructors cannot be constructed and we avoid a space leak. For example, consider this variation on countChars which only counts spaces:

countSpaces :: Int -> Pipe Char o m Int
countSpaces cnt = Await $ \mi -> case mi of
    Left  _   -> Done cnt
    Right ' ' -> countSpaces $! cnt + 1
    Right _   -> countSpaces $! cnt

If we substitute this conduit for countChars in the previous program, do we fare any better? Alas, the memory behaviour of this conduit, when shared, is in fact far, far worse.

The reason is that both the countSpaces $! cnt + 1 expression and the countSpaces $! cnt expression can be floated out by the full laziness optimization. Hence, every Await constructor will now have a function closure in its payload with two thunks, one for each alternative way to execute the conduit. What’s more, both of these thunks are retained as long as we retain a reference to the top-level conduit.

We can neatly illustrate this using the following program:

main :: IO ()
main = do
    let count = countSpaces 0
    feed ' ' count
    feed ' ' count
    feed ' ' count
    feed 'A' count
    feed 'A' count
    feed 'A' count

The first feed ' ' explores a path through the conduit where every character is a space; so this constructs (and retains) one long chain of Await constructors. The next two calls to feed ' ' however walk over the exact same path, and hence memory usage does not increase for a while. But then we explore a different path, in which every character is a non-space, and hence memory usage goes up again. Then during the second call to feed 'A' memory usage is stable again, until we start executing the last feed 'A', at which point the garbage collector can finally start cleaning things up:

What’s worse, there is an infinite number of paths through this conduit. Every different combination of space and non-space characters will explore a different path, leading to combinatorial explosion and terrifying memory usage.

Effects

The precise situation for effects depends on the underlying monad, but let’s explore one common case: IO. As we will see, for the case of IO the memory behaviour of Effect is actually similar to the memory behaviour of Await. Recall that the Effect constructor is defined as

data Pipe i o m r = Effect (m (Pipe i o m r)) | ...

Consider this simple pipe that prints the numbers [n, n-1 .. 1]:

printFrom :: Int -> Pipe i o IO ()
printFrom 0 = Done ()
printFrom n = Effect $ print n >> return (printFrom (n - 1))

We might run such a pipe using3:

runPipe :: Show r => Pipe i o IO r -> IO ()
runPipe (Done r)   = print r
runPipe (Effect k) = runPipe =<< k

In order to understand the memory behaviour of Effect, we need to understand how the underlying monad behaves. For the case of IO, IO actions are state transformers over a token RealWorld state. This means that the Effect constructor actually looks rather similar to the Await constructor. Both have a function as payload; Await a function that receives an upstream value, and Effect a function that receives a RealWorld token. To illustrate what printFrom might look like with full laziness, we can rewrite it as

printFrom :: Int -> Pipe i o IO ()
printFrom n =
    let k = printFrom (n - 1)
    in case n of
         0 -> Done ()
         _ -> Effect $ IO $ \st -> unIO (print n >> return k) st
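The IO and unIO wrappers in this rewrite correspond to GHC’s actual representation of IO in GHC.Types, which is essentially the following (primed names ours, to avoid clashing with the Prelude):

{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (RealWorld, State#)

-- An IO action is a state transformer on a RealWorld token, so
-- Effect's payload is a function closure, just like Await's.
newtype IO' a = IO' { unIO' :: State# RealWorld -> (# State# RealWorld, a #) }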

If we visualize the heap (using ghc-vis), we can see that it does indeed look very similar to the picture for Await:

Increasing sharing

If we cannot guarantee that our conduits are not shared, then perhaps we should try to increase sharing instead. If we can avoid allocating these chains of pipes, but instead have pipes refer back to themselves, perhaps we can avoid these space leaks.

In theory, this is possible. For example, when using the conduit library, we could try to take advantage of monad transformers and rewrite our feed source and our count sink as:

feed :: Source IO Char
feed = evalStateC 1000000 go
  where
    go :: Source (StateT Int IO) Char
    go = do
        st <- get
        if st == 0
            then return ()
            else do put $! (st - 1) ; yield 'A' ; go

count :: Sink Char IO Int
count = evalStateC 0 go
  where
    go :: Sink Char (StateT Int IO) Int
    go = do
        mi <- await
        case mi of
            Nothing -> get
            Just _  -> modify' (+1) >> go

In both definitions go refers back to itself directly, with no arguments; hence, it ought to be self-referential, without any long chain of sources or sinks ever being constructed. This works; the following program runs in constant space:

main :: IO ()
main = retry $ print =<< (feed $$ count)

However, this kind of code is extremely brittle. For example, consider the following minor variation on count:

count :: Sink Char IO Int
count = evalStateC 0 go
  where
    go :: Sink Char (StateT Int IO) Int
    go = withValue $ \_ -> modify' (+1) >> go

    withValue :: (i -> Sink i (StateT Int IO) Int)
              -> Sink i (StateT Int IO) Int
    withValue k = do
        mch <- await
        case mch of
            Nothing -> get
            Just ch -> k ch

This seems like a straightforward variation, but this code in fact suffers from a space leak again4. The optimized core version of this variation of count looks something like this:

count :: ConduitM Char Void (StateT Int IO) Int
count = ConduitM $ \k ->
    let countRec = modify' (+ 1) >> count
    in unConduitM await $ \mch ->
         case mch of
           Nothing -> unConduitM get k
           Just _  -> unConduitM countRec k

In the conduit library, ConduitM is a codensity transformation of an internal Pipe datatype; the latter corresponds more or less to the Pipe datastructure we’ve been describing here. But we can ignore these details: the important point here is that this has the same typical shape that we’ve been studying above, with an allocation inside a lambda but before an await.

We can fix it by writing our code as

count :: Sink Char IO Int
count = evalStateC 0 go
  where
    go :: Sink Char (StateT Int IO) Int
    go = withValue goWithValue

    goWithValue :: Char -> Sink Char (StateT Int IO) Int
    goWithValue _ = modify' (+1) >> go

    withValue :: (i -> Sink i (StateT Int IO) Int)
              -> Sink i (StateT Int IO) Int
    withValue k = do
        mch <- await
        case mch of
            Nothing -> get
            Just ch -> k ch

Ironically, it would seem that full laziness could have helped us here, by floating out that modify' (+1) >> go expression. The reason it didn’t is probably related to the exact way the k continuation is threaded through in the compiled code (I simplified a bit above). Whatever the reason, tracking down problems like these is difficult and incredibly time consuming; I’ve spent many, many hours studying the output of -ddump-simpl and comparing before and after pictures. Not a particularly productive way to spend my time, and this kind of low-level thinking is not what I want to do when writing application level Haskell code!

Composed pipes

Normally we construct pipes by composing components together. Composition of pipes can be defined as

(=$=) :: Monad m => Pipe a b m r -> Pipe b c m r -> Pipe a c m r
{-# NOINLINE (=$=) #-}
_         =$= Done r    = Done r
u         =$= Effect d  = Effect $ (u =$=) <$> d
u         =$= Yield o d = Yield o (u =$= d)
Yield o u =$= Await d   = u =$= d (Right o)
Await u   =$= Await d   = Await $ \ma -> u ma =$= Await d
Effect u  =$= Await d   = Effect $ (=$= Await d) <$> u
Done r    =$= Await d   = Done r =$= d (Left r)

The downstream pipe “is in charge”; the upstream pipe only plays a role when downstream awaits. This mirrors Haskell’s lazy “demand-driven” evaluation model.

Typically we only run self-contained pipes that don’t have any Awaits or Yields left (after composition), so we are only left with Effects. The good news is that if the pipe components don’t consist of long chains, then their composition won’t either; at every Effect point we wait for either upstream or downstream to complete its effect; only once that is done do we receive the next part of the pipeline and hence no chains can be constructed.

On the other hand, of course composition doesn’t get rid of these space leaks either. As an example, we can define a pipe equivalent to the getConduit from the introduction

getN :: Int -> Pipe i Char IO Int
getN 0 = Done 0
getN n = Effect $ do
    ch <- getChar
    return $ Yield ch (getN (n - 1))

and then compose getN and countChars to get a runnable program:

main :: IO ()
main = retry $ runPipe $ getN 1000000 =$= countChars 0

This program suffers from the same space leaks as before because the individual pipeline components are kept in memory. As in the sink example, memory behaviour would be much worse still if there were different paths through the conduit network.

Summary

At Well-Typed we’ve been developing an application for a client to do streaming data processing. We’ve been using the conduit library to do this, with great success. However, occasionally space leaks arise that are difficult to fix, and even harder to track down; of course, we’re not the first to suffer from these problems; for example, see ghc ticket #9520 or issue #6 for the streaming library (a library similar to conduit).

In this blog post we described how such space leaks arise. Similar space leaks can arise with any kind of code that uses large lazy data structures to drive computation, including other streaming libraries such as pipes or streaming, but the problem is not restricted to streaming libraries.

The conduit library tries to avoid these intermediate data structures by means of fusion rules; naturally, when this is successful the problem is avoided. We can increase the likelihood of this happening by using combinators such as folds etc., but in general the intermediate pipe data structures are difficult to avoid.

The core of the problem is that in the presence of the full laziness optimization we have no control over when values are not shared. While it is possible in theory to write code in such a way that the lazy data structures are self-referential and hence keeping them in memory does not cause a space leak, in practice the resulting code is too brittle and writing code like this is just too difficult. Just to provide one more example, in our application we had some code that looked like this:

go x@(C y _) = case y of
    Constr1 -> doSomethingWith x >> go
    Constr2 -> doSomethingWith x >> go
    Constr3 -> doSomethingWith x >> go
    Constr4 -> doSomethingWith x >> go
    Constr5 -> doSomethingWith x >> go

This worked and ran in constant space. But after adding a single additional clause to this pattern match, suddenly we reintroduced a space leak again:

go x@(C y _) = case y of
    Constr1 -> doSomethingWith x >> go
    Constr2 -> doSomethingWith x >> go
    Constr3 -> doSomethingWith x >> go
    Constr4 -> doSomethingWith x >> go
    Constr5 -> doSomethingWith x >> go
    Constr6 -> doSomethingWith x >> go

This was true even when that additional clause was never used; it had nothing to do with the change in the runtime behaviour of the code. Instead, when we added the additional clause some limit got exceeded in ghc’s bowels and suddenly something got allocated that wasn’t getting allocated before.

Full laziness can be disabled using -fno-full-laziness, but sadly this throws out the baby with the bathwater. In many cases, full laziness is a useful optimization. In particular, there is probably never any point allocating a thunk for something that is entirely static. We saw one such example above; it’s unexpected that when we write

go = withValue $ \_ -> modify' (+1) >> go

we get memory allocations corresponding to the modify' (+1) >> go expression.

Avoiding space leaks

So how do we avoid these space leaks? The key idea is pretty simple: we have to make sure the conduit is fully reconstructed on every call to runConduit. Conduit code typically looks like

runMyConduit :: Some -> Args -> IO r
runMyConduit some args = runConduit $
    stage1 some =$= stage2 args ... =$= stageN

You should put all top-level calls to runConduit into a module of their own, and disable full laziness in that module by declaring

{-# OPTIONS_GHC -fno-full-laziness #-}

at the top of the file. This means the computation of the conduit (stage1 =$= stage2 .. =$= stageN) won’t get floated to the top and the conduit will be recomputed on every invocation of runMyConduit (note that this relies on runMyConduit having some arguments; if it doesn’t, you should add a dummy one).
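As a minimal sketch of this recipe (the trivial summing pipeline is our own placeholder for the real stages):

{-# OPTIONS_GHC -fno-full-laziness #-}
module RunConduits (runMyConduit) where

import Conduit

-- With full laziness disabled in this module, the pipeline below is
-- rebuilt on every call instead of being floated out and retained.
runMyConduit :: Int -> IO Int
runMyConduit limit = runConduit $ enumFromToC 1 limit =$= sumC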

This might not be enough, however. In the example above, stageN is still a CAF, and the evaluation of the conduit stage1 =$= ... =$= stageN will cause that CAF to be evaluated and potentially retained in memory. CAFs are fine for conduits that are guaranteed to be small, or that loop back onto themselves; however, as discussed in section “Increasing sharing”, writing such conduit values is not an easy task, although it is manageable for simple conduits.

To avoid CAFs, conduits like stageN must be given a dummy argument and full laziness must be disabled for the module where stageN is defined. But it’s more subtle than that; even if a conduit does have real (non-dummy) arguments, part of that conduit might still be independent of those arguments and hence be floated to the top by the full laziness optimization, creating yet more unwanted CAF values. Full laziness must again be disabled to stop this from happening.

If you are sure that full laziness cannot float anything harmful to the top, you can leave it enabled; however, verifying that this is the case is highly non-trivial. You can of course test the code, but if you are unlucky the memory leak will only arise under certain specific usage conditions. Moreover, a small modification to the codebase, the libraries it uses, or even the compiler, perhaps years down the line, might change the program and reintroduce a memory leak.

Proceed with caution.

Addendum 1: ghc’s “state hack”

Let’s go back to the section about sinks; if you recall, we considered this example:

countChars :: Int -> Pipe Char o m Int
countChars cnt =
    let k = countChars $! cnt + 1
    in Await $ \mi -> case mi of
         Left  _ -> Done cnt
         Right _ -> k

feedFrom :: Int -> Pipe Char o m Int -> IO ()
feedFrom n (Done r)  = print r
feedFrom 0 (Await k) = feedFrom 0 $ k (Left 0)
feedFrom n (Await k) = feedFrom (n - 1) $ k (Right 'A')

main :: IO ()
main = retry $ feedFrom 10000000 (countChars 0)

We explained how countChars 0 results in a chain of Await constructors and function closures. However, you might be wondering, why would this be retained at all? After all, feedFrom is just an ordinary function, albeit one that computes an IO action. Why shouldn’t the whole expression

feedFrom 10000000 (countChars 0)

just be reduced to a single print 10000000 action, leaving no trace of the pipe at all? Indeed, this is precisely what happens when we disable ghc’s “state hack”; if we compile this program with -fno-state-hack it runs in constant space.

So what is the state hack? You can think of it as the opposite of the full laziness transformation; where full laziness transforms

\x -> \y -> let e = <expensive> in ..
  ~~>
\x -> let e = <expensive> in \y -> ..

the state hack does the opposite

\x -> let e = <expensive> in \y -> ..
  ~~>
\x -> \y -> let e = <expensive> in ..

though only for arguments y of type State# <token>. In general this is not sound, of course, as it might duplicate work; hence, the name “state hack”. Joachim Breitner’s StackOverflow answer explains why this optimization is necessary; my own blog post Understanding the RealWorld provides more background.

Let’s leave aside the question of why this optimization exists, and consider the effect on the code above. If you ask ghc to dump the optimized STG code (-ddump-stg), and translate the result back to readable Haskell, you will realize that it boils down to a single line change. With the state hack disabled the last line of feedFrom is effectively:

feedFrom n (Await k) = IO $ unIO (feedFrom (n - 1) (k (Right 'A')))

where IO and unIO just wrap and unwrap the IO monad. But when the state hack is enabled (the default), this turns into

feedFrom n (Await k) = IO $ \w -> unIO (feedFrom (n - 1) (k (Right 'A'))) w

Note how this floats the recursive call to feedFrom into the lambda. This means that

feedFrom 10000000 (countChars 0)

no longer reduces to a single print statement (after an expensive computation); instead, it reduces immediately to a function closure, waiting for its world argument. It’s this function closure that retains the Await/function chain and hence causes the space leak.

Addendum 2: Interaction with cost-centres (SCC)

A final cautionary tale. Suppose we are studying a space leak, and so we are compiling our code with profiling enabled. At some point we add some cost centres, or use -fprof-auto perhaps, and suddenly find that the space leak disappeared! What gives?

Consider one last time the sink example. We can make the space leak disappear by adding a single cost centre:

feed :: Char -> Pipe Char o m Int -> IO ()
feed ch = feedFrom 10000000
  where
    feedFrom :: Int -> Pipe Char o m Int -> IO ()
    feedFrom n p = {-# SCC "feedFrom" #-} case (n, p) of
        (_, Done r)  -> print r
        (0, Await k) -> feedFrom 0 $ k (Left 0)
        (_, Await k) -> feedFrom (n-1) $ k (Right ch)

Adding this cost centre effectively has the same result as specifying -fno-state-hack; with the cost centre present, the state hack can no longer float the computations into the lambda.

Footnotes
  1. The ability to detect upstream termination is one of the characteristics that sets conduit apart from the pipes package, in which this is impossible (or at least hard to do). Personally, I consider this an essential feature. Note that the definition of Pipe in conduit takes an additional type argument to avoid insisting that the type of the upstream return value matches the type of the downstream return value. For simplicity I’ve omitted this additional type argument here.

  2. Sinks and sources can also execute effects, of course; since we are interested in the memory behaviour of the individual constructors, we treat effects separately.

  3. runPipe is (close to) the actual runPipe we would normally use; we connect pipes that await or yield into a single self-contained pipe that does neither.

  4. For these simple examples the optimizer can actually work its magic and the space leak doesn’t appear, unless evalStateC is declared NOINLINE. Again, for larger examples problems arise whether it’s inlined or not.

  5. The original definition of retry used in this blogpost was

    retry io = catch io (\(_ :: SomeException) -> retry io)

    but as Eric Mertens rightly points out, this is not correct as catch runs the exception handler with exceptions masked. For the purposes of this blog post however the difference is not important; in fact, none of the examples in this blog post run the exception handler at all.

Categories: Offsite Blogs

Michael Snoyman: Respect

Planet Haskell - Wed, 09/28/2016 - 6:00pm

As I'm sure many people in the Haskell community have seen, Simon PJ put out an email entitled "Respect". If you haven't read it yet, I think you should. As is usually the case, Simon shows by example what we should strive for.

I put out a Tweet referring to a Gist I wrote two weeks back. At the time, I did not put the content on this blog, as I didn't want to make a bad situation worse. However, especially given Simon's comments, now seems like a good time to put out this message in the same medium (this blog) that the original inflammatory messaging came out in:

A few weeks back I wrote a blog post (and a second clarifying post) on what I called the Evil Cabal. There is no sense in repeating the content here, or even referencing it. The title is the main point.

It was a mistake, and an offensive one, to use insulting terms like evil in that blog post. What I said is true: I have taken to using that term when discussing privately some of the situation that has occurred. I now see that that was the original problem: while the term started as a joke and a pun, it set up a bad precedent for conversation. I should not have used it privately, and definitely should not have publicized it.

To those active members in projects I maligned, I apologize. I should not have brought the discourse to that level.

Categories: Offsite Blogs

FP Complete: Updated Hackage mirroring

Planet Haskell - Tue, 09/27/2016 - 6:00am

As we've discussed on this blog before, FP Complete has been running a Hackage mirror for quite a few years now. In addition to a straight S3-based mirror of raw Hackage content, we've also been running some Git repos providing the same content in an arguably more accessible format (all-cabal-files, all-cabal-hashes, and all-cabal-metadata).

In the past, we did all of this mirroring using Travis, but had to stop doing so a few months back. Also, a recent revelation showed that the downloads we were making were not as secure as I'd previously believed (due to lack of SSL between the Hackage server and its CDN). Finally, there's been off-and-on discussion for a while about unifying on one Hackage mirroring tool. After some discussion among Duncan, Herbert, and myself, all of these goals ended up culminating in this mailing list post.

This blog post details the end result of these efforts: what code is running, where it's running, how secret credentials are handled, and how we monitor the whole thing.

Code

One of the goals here was to use the new hackage-security mechanism in Hackage to validate the package tarballs and cabal file index downloaded from Hackage. This made it natural to rely on Herbert's hackage-mirror-tool code, which supports downloads, verification, and uploading to S3. There were a few minor hiccups getting things set up, but overall it was surprisingly easy to integrate, especially given that Herbert's code had previously never been used against Amazon S3 (it had been used against the Dreamhost mirror).

I made a few downstream modifications to the codebase to make it compatible with officially released versions of Cabal, Stackify it, and in the process generate Docker images. I also included a simple shell script for running the tool in a loop (based on Herbert's README instructions). The result is the snoyberg/hackage-mirror-tool Docker image.

After running this image (we'll get to how it's run later), we have a fully populated S3 mirror of Hackage guaranteeing a consistent view of Hackage (i.e., all package tarballs are available, without CDN caching issues in place). The next step is to use this mirror to populate the Git repositories. We already have all-cabal-hashes-tool and all-cabal-metadata-tool for updating the appropriate repos, and all-cabal-files is just a matter of running a tar xf on the tarball containing .cabal files. Putting all of this together, I set up the all-cabal-tool repo, containing:

  • run-inner.sh will:
    • Grab the 01-index.tar.gz file from the S3 mirror
    • Update the all-cabal-files repo
    • Use git archive in that repo to generate and update the 00-index.tar.gz file*
    • Update the all-cabal-hashes and all-cabal-metadata repos using the appropriate tools
  • run.sh uses the hackage-watcher to run run-inner.sh each time a new version of 01-index.tar.gz is available. It's able to do a simple ETag check, saving on bandwidth, disk IO, and CPU usage.
  • Dockerfile pulls in all of the relevant tools and provides a commercialhaskell/all-cabal-tool Docker image
  • You may notice some other code in that repo. I had intended to rewrite the Bash scripts and other Haskell code as a single Haskell executable for simplicity, but haven't gotten around to it yet. If anyone's interested in taking up the mantle on that, let me know.

* About this 00/01 business: 00-index.tar.gz is the original package format, without hackage-security, and is used by previous cabal-install releases, as well as Stack and possibly some other tools too. hackage-mirror-tool does not mirror this file since it has no security information, so generating it from the known-secure 01-index.tar.gz file (via the all-cabal-files repo) seemed the best option.

In setting up these images, I decided to split them into two pieces instead of combining them so that the straight Hackage mirroring bits would remain unaffected by the rest of the code, since the Hackage mirror (as we'll see later) will be available for users outside of the all-cabal* set of repos.

At the end of this, you can see that we're no longer using the original hackage-mirror code that powered the FP Complete S3 mirror for years. Unification achieved!

Kubernetes

As I mentioned, we previously ran all of this mirroring code on Travis, but had to move off of it. Anyone who's worked with me knows that I hate being a system administrator, so it was a painful few months where I had to run this code myself on an EC2 machine I set up personally. Fortunately, FP Complete runs a Kubernetes cluster these days, and that means I don't need to be a system administrator :). As mentioned, I packaged up all of the code above in two Docker images, so running them on Kubernetes is very straightforward.

For the curious, I've put the Kubernetes deployment configurations in a Gist.

Credentials

We have a few different credentials that need to be shared with these Docker containers:

  • AWS credentials for uploading
  • GPG key for signing tags
  • SSH key for pushing to Github

One of the other nice things about Kubernetes (besides allowing me to not be a sysadmin) is that it has built-in secrets support. I obviously won't be sharing those files with you, but if you look at the deployment configs I shared before, you can see how they are being referenced.

Monitoring

One annoyance I've had in the past is, if there's a bug in the scripts or some system problem, mirroring will stop for many hours before I become aware of it. I was determined to not let that be a problem again. So I put together the Hackage Mirror status page. It compares the last upload date from Hackage itself against the last modified time on various S3 artifacts, as well as the last commit for the Git repos. If any of the mirrors fall more than an hour behind Hackage itself, it returns a 500 status code. That's not technically the right code to use, but it does mean that normal HTTP monitoring/alerting tools can be used to watch that page and tell me if anything has gone wrong.

If you're curious to see the code powering this, it's available on Github.

Official Hackage mirror

With the addition of the new hackage-security metadata files to our S3 mirror, one nice benefit is that the FP Complete mirror is now an official Hackage mirror, and can be used natively by cabal-install without having to modify any configuration files. Hopefully this will be useful to end users.

And strangely enough, just as I finished this blog post, I got my first "mirrors out of sync" 500 error message ever, proving that the monitoring itself works (even if the mirroring had a bug).

What's next?

Hopefully nothing! I've spent quite a bit more time on this in the past few weeks than I'd hoped, but I'm happy with the end result. I feel confident that the mirroring processes will run reliably, I understand and trust the security model from end to end, and there's less code and machines to maintain overall.

Thank you!

Many thanks to Duncan and Herbert for granting me access to the private Hackage server to work around CDN caching issues, and to Herbert for the help and quick fixes with hackage-mirror-tool.

Categories: Offsite Blogs

Ken T Takusagawa: [rotqywrk] foldl foldr

Planet Haskell - Mon, 09/26/2016 - 5:03pm

foldl: (x * y) * z

foldr: x * (y * z)
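Instantiating * with a non-associative operator such as (-) makes the difference easy to observe:

main :: IO ()
main = do
    print (foldl (-) 0 [1, 2, 3]) -- ((0 - 1) - 2) - 3 == -6
    print (foldr (-) 0 [1, 2, 3]) -- 1 - (2 - (3 - 0)) == 2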

Also a nice reference: https://wiki.haskell.org/Foldr_Foldl_Foldl'

Categories: Offsite Blogs

Functional Jobs: Senior Backend Engineer at Euclid Analytics (Full-time)

Planet Haskell - Mon, 09/26/2016 - 3:53pm

We are looking to add a senior individual contributor to the backend engineering team! Our team is responsible for creating and maintaining the infrastructure that powers the Euclid Analytics Engine. We leverage a forward thinking and progressive stack built in Scala and Python, with an infrastructure that uses Mesos, Spark and Kafka. As a senior engineer you will build out our next generation ETL pipeline. You will need to use and build tools to interact with our massive data set in as close to real time as possible. If you have previous experience with functional programming and distributed data processing tools such as Spark and Hadoop, then you would make a great fit for this role!

RESPONSIBILITIES:

  • Partnering with the data science team to architect and build Euclid’s big data pipeline
  • Building tools and services to maintain a robust, scalable data service layer
  • Leverage technologies such as Spark and Kafka to grow our predictive analytics and machine learning capabilities in real time
  • Finding innovative solutions to performance issues and bottlenecks

REQUIREMENTS:

  • At least 3 years industry experience in a full time role utilizing Scala or other modern functional programming languages (Haskell, Clojure, Lisp, etc.)
  • Database management experience (MySQL, Redis, Cassandra, Redshift, MemSQL)
  • Experience with big data infrastructure including Spark, Mesos, Scalding and Hadoop
  • Excited about data flow and orchestration with tools like Kafka and Spark Streaming
  • Have experience building production deployments using Amazon Web Services or Heroku’s Cloud Application Platform
  • B.S. or equivalent in Computer Science or another technical field

Get information on how to apply for this position.

Categories: Offsite Blogs

Derek Elkins: Quotient Types for Programmers

Planet Haskell - Thu, 09/22/2016 - 11:21pm
Introduction

Programmers in typed languages with higher order functions and algebraic data types are already comfortable with most of the basic constructions of set/type theory. In categorical terms, those programmers are familiar with finite products and coproducts and (monoidal/cartesian) closed structure. The main omissions are subset types (equalizers/pullbacks) and quotient types (coequalizers/pushouts) which would round out limits and colimits. Not having a good grasp on either of these constructions dramatically shrinks the world of mathematics that is understandable, but while subset types are fairly straightforward, quotient types are quite a bit less intuitive.

Subset Types

In my opinion, most programmers can more or less immediately understand the notion of a subset type at an intuitive level.
A subset type is just a type combined with a predicate on that type that specifies which values of the type we want. For example, we may have something like { n:Nat | n /= 0 } meaning the type of naturals not equal to #0#. We may use this in the type of the division function for the denominator. Consuming a value of a subset type is easy, a natural not equal to #0# is still just a natural, and we can treat it as such. The difficult part is producing a value of a subset type. To do this, we must, of course, produce a value of the underlying type — Nat in our example — but then we must further convince the type checker that the predicate holds (e.g. that the value does not equal #0#). Most languages provide no mechanism to prove potentially arbitrary facts about code, and this is why they do not support subset types. Dependently typed languages do provide such mechanisms and thus either have or can encode subset types. Outside of dependently typed languages the typical solution is to use an abstract data type and use a runtime check when values of that abstract data type are created.
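In Haskell, the abstract-data-type-plus-runtime-check approach might look like the following minimal sketch (with Int standing in for Nat; all names are ours):

module NonZero (NonZero, mkNonZero, getNonZero) where

-- The constructor is not exported, so the runtime check in mkNonZero
-- is the only way to build a value; it stands in for the proof.
newtype NonZero = NonZero Int

mkNonZero :: Int -> Maybe NonZero
mkNonZero 0 = Nothing
mkNonZero n = Just (NonZero n)

getNonZero :: NonZero -> Int
getNonZero (NonZero n) = n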

Quotient Types

The dual of subset types are quotient types. My impression is that this construction is the most difficult basic construction for people to understand. Further, programmers aren’t much better off, because they have little to which to connect the idea. Before I give a definition, I want to provide the example with which most people are familiar: modular (or clock) arithmetic. A typical way this is first presented is as a system where the numbers “wrap-around”. For example, in arithmetic mod #3#, we count #0#, #1#, #2#, and then wrap back around to #0#. Programmers are well aware that it’s not necessary to guarantee that an input to addition, subtraction, or multiplication mod #3# is either #0#, #1#, or #2#. Instead, the operation can be done and the mod function can be applied at the end. This will give the same result as applying the mod function to each argument at the beginning. For example, #4+7 = 11# and #11 mod 3 = 2#, and #4 mod 3 = 1# and #7 mod 3 = 1# and #1+1 = 2 = 11 mod 3#.
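Spelled out as a Haskell property (a quick sketch), the claim is that reducing at the end agrees with reducing each argument first:

-- Applying `mod` at the end agrees with applying it to each argument:
prop_addMod3 :: Int -> Int -> Bool
prop_addMod3 a b = (a + b) `mod` 3 == ((a `mod` 3) + (b `mod` 3)) `mod` 3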

For mathematicians, the type of integers mod #n# is represented by the quotient type #ZZ//n ZZ#. The idea is that the values of #ZZ // n ZZ# are integers except that we agree that any two integers #a# and #b# are treated as equal if #a - b = kn# for some integer #k#. For #ZZ // 3 ZZ#, #… -6 = -3 = 0 = 3 = 6 = …# and #… = -5 = -2 = 1 = 4 = 7 = …# and #… = -4 = -1 = 2 = 5 = 8 = …#.

Equivalence Relations

To start to formalize this, we need the notion of an equivalence relation. An equivalence relation is a binary relation #(~~)# which is reflexive (#x ~~ x# for all #x#), symmetric (if #x ~~ y# then #y ~~ x#), and transitive (if #x ~~ y# and #y ~~ z# then #x ~~ z#). We can check that “#a ~~ b# iff there exists an integer #k# such that #a-b = kn#” defines an equivalence relation on the integers for any given #n#. For reflexivity we have #a - a = 0n#. For symmetry we have if #a - b = kn# then #b - a = -kn#. Finally, for transitivity we have if #a - b = k_1 n# and #b - c = k_2 n# then #a - c = (k_1 + k_2)n# which we get by adding the preceding two equations.

Any relation can be extended to an equivalence relation. This is called the reflexive-, symmetric-, transitive-closure of the relation. For an arbitrary binary relation #R# we can define the equivalence relation #(~~_R)# via “#a ~~_R b# iff #a = b# or #R(a, b)# or #b ~~_R a# or #a ~~_R c and c ~~_R b# for some #c#“. To be precise, #~~_R# is the smallest relation satisfying those constraints. In Datalog syntax, this looks like:

eq_r(A, A).
eq_r(A, B) :- r(A, B).
eq_r(A, B) :- eq_r(B, A).
eq_r(A, B) :- eq_r(A, C), eq_r(C, B).

Quotient Types: the Type Theory view

If #T# is a type, and #(~~)# is an equivalence relation, we use #T // ~~# as the notation for the quotient type, which we read as “#T# quotiented by the equivalence relation #(~~)#”. We call #T# the underlying type of the quotient type. We then say #a = b# at type #T // ~~# iff #a ~~ b#. Dual to subset types, producing a value of a quotient type is easy. Any value of the underlying type is a value of the quotient type. (In type theory, this produces the perhaps surprising result that #ZZ# is a subtype of #ZZ // n ZZ#.) As expected, consuming a value of a quotient type is more complicated. To explain this, we need to explain what a function #f : T // ~~ -> X# is for some type #X#. A function #f : T // ~~ -> X# is a function #g : T -> X# which satisfies #g(a) = g(b)# for all #a# and #b# such that #a ~~ b#. We call #f# (or #g#, they are often conflated) well-defined if #g# satisfies this condition. In other words, any well-defined function that consumes a quotient type isn’t allowed to produce an output that distinguishes between equivalent inputs. A better way to understand this is that quotient types allow us to change what the notion of equality is for a type. From this perspective, a function being well-defined just means that it is a function. Taking equal inputs to equal outputs is one of the defining characteristics of a function.

Sometimes we can finesse needing to check the side condition. Any function #h : T -> B# gives rise to an equivalence relation on #T# via #a ~~ b# iff #h(a) = h(b)#. In this case, any function #g : B -> X# gives rise to a function #f : T // ~~ -> X# via #f = g @ h#. In particular, when #B = T# we are guaranteed to have a suitable #g# for any function #f : T // ~~ -> X#. In this case, we can implement quotient types in a manner quite similar to subset types, namely we make an abstract type and we normalize with the #h# function as we either produce or consume values of the abstract type. A common example of this is rational numbers. We can reduce a rational number to lowest terms either when it’s produced or when the numerator or denominator get accessed, so that we don’t accidentally write functions which distinguish between #1/2# and #2/4#. For modular arithmetic, the mod by #n# function is a suitable #h#.
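Here is a sketch of that pattern for rationals (our own simplified version; Data.Ratio in base does the real thing):

module Rat (Rat, (%), numerator, denominator) where

-- Constructor hidden: every Rat is reduced to lowest terms when it is
-- built, so no function can distinguish 1/2 from 2/4.
data Rat = Rat Integer Integer

-- We assume a non-zero denominator and gloss over sign normalization.
(%) :: Integer -> Integer -> Rat
n % d = Rat (n `div` g) (d `div` g)
  where g = gcd n d

numerator, denominator :: Rat -> Integer
numerator   (Rat n _) = n
denominator (Rat _ d) = d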

Quotient Types: the Set Theory view

In set theory such an #h# function can always be made by mapping the elements of #T# to the equivalence classes that contain them, i.e. #a# gets mapped to #{b | a ~~ b}# which is called the equivalence class of #a#. In fact, in set theory, #T // ~~# is usually defined to be the set of equivalence classes of #(~~)#. So, for the example of #ZZ // 3 ZZ#, in set theory, it is a set of exactly three elements: the elements are #{ 3n+k | n in ZZ}# for #k = 0, 1, 2#. Equivalence classes are also called partitions and are said to partition the underlying set. Elements of these equivalence classes are called representatives of the equivalence class. Often a notation like #[a]# is used for the equivalence class of #a#.

More Examples

Here is a quick run-through of some significant applications of quotient types. I’ll give the underlying type and the equivalence relation and what the quotient type produces. I’ll leave it as an exercise to verify that the equivalence relations really are equivalence relations, i.e. reflexive, symmetric, and transitive. I’ll start with more basic examples. You should work through them to be sure you understand how they work.

Integers

Integers can be presented as pairs of naturals #(n, m)# with the idea being that the pair represents “#n - m#”. Of course, #1 - 2# should be the same as #2 - 3#. This is expressed as #(n_1, m_1) ~~ (n_2, m_2)# iff #n_1 + m_2 = n_2 + m_1#. Note how this definition only relies on operations on natural numbers. You can explore how to define addition, subtraction, multiplication, and other operations on this representation in a well-defined manner.
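For example, addition on this representation goes through componentwise, and checking well-definedness is a one-line calculation (a sketch using Natural from base):

import Numeric.Natural (Natural)

-- The pair (n, m) represents "n - m".
newtype Z = Z (Natural, Natural)

-- Componentwise addition. Well-definedness: if n1 + m1' = n1' + m1
-- and n2 + m2' = n2' + m2, then adding the equations gives
-- (n1 + n2) + (m1' + m2') = (n1' + n2') + (m1 + m2).
addZ :: Z -> Z -> Z
addZ (Z (n1, m1)) (Z (n2, m2)) = Z (n1 + n2, m1 + m2)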

Rationals

Rationals can be presented very similarly to integers, only with multiplication instead of addition. We also have pairs #(n, d)#, usually written #n/d#, in this case of an integer #n# and a non-zero natural #d#. The equivalence relation is #(n_1, d_1) ~~ (n_2, d_2)# iff #n_1 d_2 = n_2 d_1#.

(Topological) Circles

We can extend the integers mod #n# to the continuous case. Consider the real numbers with the equivalence relation #r ~~ s# iff #r - s = k# for some integer #k#. You could call this the reals mod #1#. Topologically, this is a circle. If you walk along it far enough, you end up back at a point equivalent to where you started. Occasionally this is written as #RR//ZZ#.

Torii

Doing the previous example in 2D gives a torus. Specifically, we have pairs of real numbers and the equivalence relation #(x_1, y_1) ~~ (x_2, y_2)# iff #x_1 - x_2 = k# and #y_1 - y_2 = l# for some integers #k# and #l#. Quite a bit of topology relies on similar constructions as will be expanded upon on the section on gluing.

Unordered pairs

Here’s an example that’s a bit closer to programming. Consider the following equivalence relation on arbitrary pairs: #(a_1, b_1) ~~ (a_2, b_2)# iff #a_1 = a_2 and b_1 = b_2# or #a_1 = b_2 and b_1 = a_2#. This just says that a pair is equivalent to either itself, or a swapped version of itself. It’s interesting to consider what a well-defined function is on this type.1
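For instance (our own toy examples):

-- Well-defined on unordered pairs: invariant under swapping.
sumPair :: (Int, Int) -> Int
sumPair (a, b) = a + b

-- Not well-defined: distinguishes (1, 2) from (2, 1), which the
-- quotient declares equal.
fstPair :: (Int, Int) -> Int
fstPair (a, _) = a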

Gluing / Pushouts

Returning to topology and doing a bit more involved construction, we arrive at gluing or pushouts. In topology, we often want to take two topological spaces and glue them together in some specified way. For example, we may want to take two discs and glue their boundaries together. This gives a sphere. We can combine two spaces into one with the disjoint sum (or coproduct, i.e. Haskell’s Either type.) This produces a space that contains both the input spaces, but they don’t interact in any way. You can visualize them as sitting next to each other but not touching. We now want to say that certain pairs of points, one from each of the spaces, are really the same point. That is, we want to quotient by an equivalence relation that would identify those points. We need some mechanism to specify which points we want to identify. One way to accomplish this is to have a pair of functions, #f : C -> A# and #g : C -> B#, where #A# and #B# are the space we want to glue together. We can then define a relation #R# on the disjoint sum via #R(a, b)# iff there’s a #c : C# such that #a = tt "inl"(f(c)) and b = tt "inr"(g(c))#. This is not an equivalence relation, but we can extend it to one. The quotient we get is then the gluing of #A# and #B# specified by #C# (or really by #f# and #g#). For our example of two discs, #f# and #g# are the same function, namely the inclusion of the boundary of the disc into the disc. We can also glue a space to itself. Just drop the disjoint sum part. Indeed, the circle and torus are examples.

Polynomial ring ideals

We write #RR[X]# for the type of polynomials with one indeterminate #X# with real coefficients. For two indeterminates, we write #RR[X, Y]#. Values of these types are just polynomials such as #X^2 + 1# or #X^2 + Y^2#. We can consider quotienting these types by equivalence relations generated from identifications like #X^2 + 1 ~~ 0# or #X^2 - Y ~~ 0#, but we want more than just the reflexive-, symmetric-, transitive-closure. We want this equivalence relation to also respect the operations we have on polynomials, in particular, addition and multiplication. More precisely, we want that if #a ~~ b# and #c ~~ d# then #ac ~~ bd# and similarly for addition. An equivalence relation that respects all operations is called a congruence. The standard notation for the quotient of #RR[X, Y]# by a congruence generated by both of the previous identifications is #RR[X, Y]//(X^2 + 1, X^2 - Y)#. Now if #X^2 + 1 = 0# in #RR[X, Y]//(X^2 + 1, X^2 - Y)#, then for any polynomial #P(X, Y)#, we have #P(X, Y)(X^2 + 1) = 0# because #0# times anything is #0#. Similarly, for any polynomial #Q(X, Y)#, #Q(X, Y)(X^2 - Y) = 0#. Of course, #0 + 0 = 0#, so it must be the case that #P(X, Y)(X^2 + 1) + Q(X, Y)(X^2 - Y) = 0# for all polynomials #P# and #Q#. In fact, we can show that all elements in the equivalence class of #0# are of this form. You’ve now motivated the concrete definition of a ring ideal and given its significance. An ideal is an equivalence class of #0# with respect to some congruence. Let’s work out what #RR[X, Y]//(X^2 + 1, X^2 - Y)# looks like concretely. First, since #X^2 - Y = 0#, we have #Y = X^2# and so we see that values of #RR[X, Y]//(X^2 + 1, X^2 - Y)# will be polynomials in only one indeterminate because we can replace all #Y#s with #X^2#s. Since #X^2 = -1#, we can see that all those polynomials will be linear (i.e. of degree 1) because we can just keep replacing #X^2#s with #-1#s, i.e. #X^(n+2) = X^n X^2 = -X^n#. The end result is that an arbitrary polynomial in #RR[X, Y]//(X^2 + 1, X^2 - Y)# looks like #a + bX# for real numbers #a# and #b# and we have #X^2 = -1#. In other words, #RR[X, Y]//(X^2 + 1, X^2 - Y)# is isomorphic to the complex numbers, #CC#.

As a reasonably simple exercise, given a polynomial #P(X) : RR[X]#, what does it get mapped to when embedded into #RR[X]//(X - 3)#, i.e. what is #[P(X)] : RR[X]//(X - 3)#?2

Free algebras modulo an equational theory

Moving much closer to programming, we have a rather broad and important example that a mathematician might describe as free algebras modulo an equational theory. This example covers several of the preceding examples. In programmer-speak, a free algebra is just a type of abstract syntax trees for some language. We’ll call a specific abstract syntax tree a term. An equational theory is just a collection of pairs of terms with the idea being that we’d like these terms to be considered equal. To be a bit more precise, we will actually allow terms to contain (meta)variables. An example equation for an expression language might be Add(#x#,#x#) = Mul(2,#x#). We call a term with no variables a ground term. We say a ground term matches another term if there is a consistent substitution for the variables that makes the latter term syntactically equal to the ground term. E.g. Add(3, 3) matches Add(#x#,#x#) via the substitution #x |->#3. Now, the equations of our equational theory give rise to a relation on ground terms #R(t_1, t_2)# iff there exists an equation #l = r# such that #t_1# matches #l# and #t_2# matches #r#. This relation can be extended to an equivalence relation on ground terms, and we can then quotient by that equivalence relation.

Let’s consider a worked example. We can consider the theory of monoids. We have two operations (types of AST nodes): Mul(#x#,#y#) and 1. We have the following three equations: Mul(1,#x#) =#x#, Mul(#x#, 1) =#x#, and Mul(Mul(#x#,#y#),#z#) = Mul(#x#, Mul(#y#,#z#)). We additionally have a bunch of constants subject to no equations. In this case, it turns out we can define a normalization function, what I called #h# far above, and that the quotient type is isomorphic to lists of constants. Now, we can extend this theory to the theory of groups by adding a new operation, Inv(#x#), and new equations: Inv(Inv(#x#)) =#x#, Inv(Mul(#x#,#y#)) = Mul(Inv(#y#), Inv(#x#)), and Mul(Inv(#x#),#x#) = 1. If we ignore the last of these equations, you can show that we can normalize to a form that is isomorphic to a list of a disjoint sum of the constants, i.e. [Either Const Const] in Haskell if Const were the type of the constant terms. Quotienting this type by the equivalence relation extended with that final equality, corresponds to adding the rule that a Left c cancels out Right c in the list whenever they are adjacent.
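Here is a sketch of the monoid half of this worked example in Haskell (our own encoding):

-- Abstract syntax trees for the theory of monoids over constants c.
data MTerm c
    = One
    | Mul (MTerm c) (MTerm c)
    | Const c

-- The normalization function (the h from earlier): two ground terms
-- are equal in the quotient iff they normalize to the same list.
normalize :: MTerm c -> [c]
normalize One       = []
normalize (Const c) = [c]
normalize (Mul x y) = normalize x ++ normalize y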

This overall example is a fairly profound one. Almost all of abstract algebra can be viewed as an instance of this or a closely related variation. When you hear about things defined in terms of “generators and relators”, it is an example of this sort. Indeed, those “relators” are used to define a relation that will be extended to an equivalence relation. Being defined in this way is arguably what it means for something to be “algebraic”.

Postscript

The Introduction to Type Theory section of the NuPRL book provides a more comprehensive and somewhat more formal presentation of these and related concepts. While the quotient type view of quotients is conceptually different from the standard set theoretic presentation, it is much more amenable to computation as the #ZZ // n ZZ# example begins to illustrate.

  1. It’s a commutative function.

  2. It gets mapped to its value at #3#, i.e. #P(3)#.

Categories: Offsite Blogs

Michael Snoyman: Proposed conduit reskin

Planet Haskell - Thu, 09/22/2016 - 6:00pm

In a few different conversations I've had with people, the idea of reskinning some of the surface syntax of the conduit library has come up, and I wanted to share the idea here. I call this "reskinning" since all of the core functionality of conduit would remain unchanged in this proposal, we'd just be changing operators and functions a bit.

The idea here is: conduit borrowed the operator syntax of $$, =$ and $= from enumerator, and it made sense at the beginning of its lifecycle. However, for quite a while now conduit has evolved to the point of having a unified type for Sources, Conduits, and Sinks, and the disparity of operators adds more confusion than it may be worth. So without further ado, let's compare a few examples of conduit usage between the current skin:

import Conduit
import qualified Data.Conduit.Binary as CB

main :: IO ()
main = do
    -- copy files
    runResourceT $ CB.sourceFile "source.txt" $$ sinkFile "dest.txt"

    -- sum some numbers
    print $ runIdentity $ enumFromToC 1 100 $$ sumC

    -- print a bunch of numbers
    enumFromToC 1 100 $$ mapC (* 2) =$ takeWhileC (< 100) =$ mapM_C print

With a proposed reskin:

import Conduit2
import qualified Data.Conduit.Binary as CB

main :: IO ()
main = do
    -- copy files
    runConduitRes $ CB.sourceFile "source.txt" .| sinkFile "dest.txt"

    -- sum some numbers
    print $ runConduitPure $ enumFromToC 1 100 .| sumC

    -- print a bunch of numbers
    runConduit $ enumFromToC 1 100
              .| mapC (* 2)
              .| takeWhileC (< 100)
              .| mapM_C print

This reskin is easily defined with this module:

{-# LANGUAGE FlexibleContexts #-}
module Conduit2
    ( module Conduit
    , module Conduit2
    ) where

import Conduit hiding (($$), (=$), ($=), (=$=))
import Data.Void (Void)

infixr 2 .|
(.|) :: Monad m => ConduitM a b m () -> ConduitM b c m r -> ConduitM a c m r
(.|) = fuse

runConduitPure :: ConduitM () Void Identity r -> r
runConduitPure = runIdentity . runConduit

runConduitRes :: MonadBaseControl IO m
              => ConduitM () Void (ResourceT m) r -> m r
runConduitRes = runResourceT . runConduit

To put this in words:

  • Replace the $=, =$, and =$= operators - which are all synonyms of each other - with the .| operator. This borrows intuition from the Unix shell, where the pipe operator denotes piping data from one process to another. The analogy holds really well for conduit, so why not borrow it? (We call all of these operators "fusion.")
  • Get rid of the $$ operator - also known as the "connect" or "fuse-and-run" operator - entirely. Instead of having this two-in-one action, separate it into .| and runConduit. The advantage is that no one needs to think about whether to use .| or $$, as happens today. (Note that runConduit is available in the conduit library today, it's just not very well promoted.)
  • Now that runConduit is a first-class citizen, add in some helper functions for two common use cases: running with ResourceT and running a pure conduit.

The goals here are to improve consistency, readability, and intuition about the library. Of course, there are some downsides:

  • There's a slight performance advantage (not benchmarked recently unfortunately) to foo $$ bar versus runConduit $ foo =$= bar, since the former combines both sets of actions into one. We may be able to gain some of this back with GHC rewrite rules, but my experience with rewrite rules in conduit has been less than reliable.
  • Inertia: there's a lot of code and material out there using the current set of operators. While we don't need to ever remove (or even deprecate) the current operators, having two ways of writing conduit code in the wild can be confusing.
  • Conflicting operator: doing a quick Hoogle search reveals that the parallel package already uses .|. We could choose a different operator instead (|. for instance seems unclaimed), but generally I get nervous any time I'm defining new operators.
  • For simple cases like source $$ sink, code is now quite a few keystrokes longer: runConduit $ source .| sink.

Code wise, this is a trivial change to implement. Updating docs to follow this new convention wouldn't be too difficult either. The question is: is this a good idea?

Categories: Offsite Blogs

Automating Ad hoc Data Representation Transformations

Lambda the Ultimate - Thu, 09/22/2016 - 12:29pm

Automating Ad hoc Data Representation Transformations by Vlad Ureche, Aggelos Biboudis, Yannis Smaragdakis, and Martin Odersky:

To maximize run-time performance, programmers often specialize their code by hand, replacing library collections and containers by custom objects in which data is restructured for efficient access. However, changing the data representation is a tedious and error-prone process that makes it hard to test, maintain and evolve the source code.

We present an automated and composable mechanism that allows programmers to safely change the data representation in delimited scopes containing anything from expressions to entire class definitions. To achieve this, programmers define a transformation and our mechanism automatically and transparently applies it during compilation, eliminating the need to manually change the source code.

Our technique leverages the type system in order to offer correctness guarantees on the transformation and its interaction with object-oriented language features, such as dynamic dispatch, inheritance and generics.

We have embedded this technique in a Scala compiler plugin and used it in four very different transformations, ranging from improving the data layout and encoding, to retrofitting specialization and value class status, and all the way to collection deforestation. On our benchmarks, the technique obtained speedups between 1.8x and 24.5x.

This is a realization of an idea that has been briefly discussed here on LtU a few times, whereby a program is written using high-level representations, and the user has the option to provide a lowering to a more efficient representation after the fact.

This contrasts with the typical approach of providing efficient primitives, like primitive unboxed values, and leaving it to the programmer to compose them efficiently up front.

Categories: Offsite Discussion

Philip Wadler: Lambdaman, supporting Bootstrap

Planet Haskell - Wed, 09/21/2016 - 6:16pm

After watching talks or videos of Propositions as Types, folk ask me how they can get their own Lambdaman t-shirt. In the past, I tried to make it available through various services, but they always rejected the design as a copyright violation. (It's not, it's fair use.) Thanks to a little help from my friends, CustomInk has agreed to print the design as a Booster. Sign up now, order will be printed on October 15. Any profits (there will be more if there is a bigger order) go to Bootstrap, an organisation run by Shriram Krishnamurthi, Matthias Felleisen, and the PLT group that teaches functional programming to middle and high school students. Order has already surpassed our goal of fifty shirts!
Categories: Offsite Blogs

FP Complete: Practical Haskell: Simple File Mirror (Part 2)

Planet Haskell - Wed, 09/21/2016 - 6:00am

This is part 2 of a three part series. If you haven't seen it already, I'd recommend starting with the first part, which covers communication protocols and streaming of data. This second part will cover network communication and some basic concurrency in Haskell.

Simple HTTP client

We saw previously how to send and receive binary data using the conduit library. We're going to build on this with a conduit-aware network library. This first example will make a very simplistic, hard-coded HTTP request and send the entire response from the server to standard output.

#!/usr/bin/env stack
-- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import ClassyPrelude.Conduit
import Data.Conduit.Network (runTCPClient, appSource, appSink, clientSettings)

main :: IO ()
main = runTCPClient settings $ \appData -> do
    yield request $$ appSink appData
    appSource appData $$ stdoutC
  where
    settings = clientSettings 80 "httpbin.org"

request :: ByteString
request = encodeUtf8 $ unlines
    [ "GET /get?foo=bar&baz=bin HTTP/1.1"
    , "Host: httpbin.org"
    , "User-Agent: Practical Haskell"
    , "Connection: close"
    , ""
    ]

The runTCPClient function creates the actual TCP connection, and provides access to it via the appData value. This value allows us to send data to the server (via appSink) and receive data from the server (via appSource). We can also get information about the connection, such as the locally used port number, which we're not using in this example; the sketch below shows how you could get at it.
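As a quick aside, here is a minimal sketch of my own (not part of the original article) for inspecting that connection metadata. It assumes the appSockAddr and appLocalAddr accessors from Data.Streaming.Network in streaming-commons, the package the conduit network layer builds on:

#!/usr/bin/env stack
-- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import ClassyPrelude.Conduit
import Data.Conduit.Network (runTCPClient, clientSettings)
import Data.Streaming.Network (appSockAddr, appLocalAddr)

main :: IO ()
main = runTCPClient (clientSettings 80 "httpbin.org") $ \appData -> do
    print (appSockAddr appData)  -- the remote address we connected to
    print (appLocalAddr appData) -- Just our local address/port, when known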

We've hard-coded a settings value that states we should connect to host httpbin.org* on port 80. We've also hard-coded the HTTP request itself, which is thoroughly uninteresting.

Once our connection has been established, we send our hard-coded request to the server with yield request $$ appSink appData. When that's complete, we stream all data from the server to standard output with appSource appData $$ stdoutC.

The output from this looks very much like you'd expect it to:

HTTP/1.1 200 OK
Server: nginx
Date: Wed, 21 Sep 2016 07:38:30 GMT
Content-Type: application/json
Content-Length: 224
Connection: close
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

{
  "args": {
    "baz": "bin",
    "foo": "bar"
  },
  "headers": {
    "Host": "httpbin.org",
    "User-Agent": "Practical Haskell"
  },
  "origin": "31.210.186.0",
  "url": "http://httpbin.org/get?foo=bar&baz=bin"
}

* Side note: anyone playing with HTTP client software should definitely check out httpbin.org; it's a great resource.

Upgrading to TLS

On a small tangent, it's trivial to adapt the above program to work over secure HTTPS instead of plaintext HTTP. All we need to do is:

  • Use the Data.Conduit.Network.TLS module from the network-conduit-tls library
  • Swap runTLSClient for runTCPClient, and tlsClientConfig for clientSettings
  • Change port 80 to port 443

The code looks as follows. To convince yourself that this is real: go ahead and run it and see what the url value in the response body looks like.

#!/usr/bin/env stack
{- stack --resolver nightly-2016-09-10 --install-ghc runghc
     --package classy-prelude-conduit
     --package network-conduit-tls
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import ClassyPrelude.Conduit
import Data.Conduit.Network (appSink, appSource)
import Data.Conduit.Network.TLS (runTLSClient, tlsClientConfig)

main :: IO ()
main = runTLSClient settings $ \appData -> do
    yield request $$ appSink appData
    appSource appData $$ stdoutC
  where
    settings = tlsClientConfig 443 "httpbin.org"

request :: ByteString
request = encodeUtf8 $ unlines
    [ "GET /get?foo=bar&baz=bin HTTP/1.1"
    , "Host: httpbin.org"
    , "User-Agent: Practical Haskell"
    , "Connection: close"
    , ""
    ]

Echo server

Let's play with the server side of things. We're going to implement an echo server, which will receive a chunk of data from the client and then send it right back.

#!/usr/bin/env stack
-- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import ClassyPrelude.Conduit
import Data.Conduit.Network (appSink, appSource, runTCPServer, serverSettings)

main :: IO ()
main = runTCPServer settings $ \appData ->
    appSource appData $$ appSink appData
  where
    settings = serverSettings 4200 "*"

This listens on port 4200, on all network interfaces ("*"). We start our server with runTCPServer, which grabs a listening socket and waits for connections. For each connection, it forks a new thread, and runs the provided application. In this case, our application is trivial: we connect the source to the sink, automatically piping data from the connection back to itself.

To stress a point above: this is a fully multithreaded server application. You can make multiple telnet connections to the server and interact with each of them independently. This is a lot of bang for very little buck.

For those of you concerned about the inefficiency of forking a new thread for each incoming connection: Haskell's runtime is built on top of green threads, making the act of forking very cheap. There are more details available in a talk I gave on "Haskell for fast, concurrent, robust services" (relevant slide and video link).
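If you'd like to see that cheapness for yourself, here is a quick experiment of my own (not from the article): fork ten thousand threads, each of which "blocks" for a tenth of a second. The whole program finishes in roughly that tenth of a second, not in 10,000 × 0.1 seconds.

#!/usr/bin/env stack
-- stack --resolver nightly-2016-09-10 --install-ghc runghc
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)

main :: IO ()
main = do
    done <- newEmptyMVar
    -- Fork 10,000 green threads; each one sleeps for 0.1 seconds.
    replicateM_ 10000 $ forkIO $ do
        threadDelay 100000
        putMVar done ()
    -- Wait for every thread to check in.
    replicateM_ 10000 $ takeMVar done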

Full duplex

The examples so far have all been half duplex, meaning they have always been either sending or receiving data. Let's implement a full duplex application: a simple telnet client replacement. We need to wait for any input from standard input, while at the same time waiting for any input from the socket. We're going to take advantage of Haskell threading to handle this case too:

#!/usr/bin/env stack
-- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import ClassyPrelude.Conduit
import Data.Conduit.Network (appSink, appSource, runTCPClient, clientSettings)

main :: IO ()
main = runTCPClient settings $ \appData -> race_
    (stdinC $$ appSink appData)
    (appSource appData $$ stdoutC)
  where
    settings = clientSettings 4200 "localhost"

The race_ function is a wonderful helper for concurrency, which says "run these two actions, see which one finishes first, kill the other one, and ignore any results (the _ at the end of the name)." It has a sibling function, concurrently, for running two things until they both complete. You can implement a surprisingly large number of common concurrency solutions using just these two functions. For more information, see the library package tutorial on haskell-lang.org.
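To see the two combinators side by side outside of a network context, here is a minimal sketch of my own using the async package directly (the race_ used above comes from classy-prelude's re-exports of the same functionality):

#!/usr/bin/env stack
-- stack --resolver nightly-2016-09-10 --install-ghc runghc --package async
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (concurrently, race_)

main :: IO ()
main = do
    -- race_: whichever action finishes first wins; the other is killed.
    race_ (threadDelay 1000000 >> putStrLn "fast: done first")
          (threadDelay 5000000 >> putStrLn "slow: never prints")
    -- concurrently: both actions run to completion, results paired up.
    _ <- concurrently (putStrLn "one") (putStrLn "two")
    return ()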

You may be terrified of the performance characteristics of this: we've introduced two blocking threads, when theoretically callback-based I/O would be far more efficient! Not to worry: in Haskell, the runtime system uses a fully callback-based system under the surface, using whatever system calls are relevant for your operating system. When a Haskell green thread makes a "blocking" I/O call, what actually happens is that the runtime puts the thread to sleep, installs a callback handler to wait for data to be available, and, when the callback is triggered, wakes the green thread up again.

The details of the Haskell runtime are well described in the paper Mio: A High-Performance Multicore IO Manager for GHC. Fortunately, for most real-world cases, you can write the naive, easy-to-conceptualize I/O operations based on blocking semantics, and automatically get the great performance you'd want from event/callback-based system calls.

Client and server in same process

Just to prove that we can: let's throw our client and server into a single process, using the same concurrency approach we've had until now.

#!/usr/bin/env stack
-- stack --resolver nightly-2016-09-10 --install-ghc runghc --package classy-prelude-conduit
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import ClassyPrelude.Conduit
import Data.Conduit.Network (appSink, appSource, runTCPClient, clientSettings, runTCPServer, serverSettings)

main :: IO ()
main = race_ server client

server :: IO ()
server = runTCPServer settings $ \appData ->
    appSource appData $$ appSink appData
  where
    settings = serverSettings 4200 "*"

client :: IO ()
client = do
    -- Sleep for 1 second (1 million microseconds) to give the server a
    -- chance to start up. There are definitely better ways to do
    -- this, but this is good enough for our example.
    threadDelay 1000000
    runTCPClient settings $ \appData -> race_
        (stdinC $$ appSink appData)
        (appSource appData $$ stdoutC)
  where
    settings = clientSettings 4200 "localhost"

This isn't a particularly useful application (stdinC $$ stdoutC would do the same thing without wasting a network connection), but it does show how easy it is to combine various pieces of code in Haskell for concurrent applications.

Next time on Practical Haskell

We've so far figured out how to deal with our simple file mirror's communication protocol, and how to do network communication. All that's left is combining these two things together and wrapping it up with a command line interface. Stay tuned!

Categories: Offsite Blogs

Brent Yorgey: The generic-random library, part 1: simple generic Arbitrary instances

Planet Haskell - Tue, 09/20/2016 - 4:27pm

In a previous post I pointed out that we know all the theory to make nice, principled, practical random generators for recursive algebraic data types; someone just needed to step up and do the work. Well, Li-yao Xia took up the challenge and produced a brilliant package, generic-random, available on Hackage right now for you to use!

However, although the package does include some Haddock documentation, it is probably difficult for someone with no experience or background in this area to navigate. So I thought it would be worth writing a few blog posts by way of a tutorial and introduction to the package.

> {-# LANGUAGE GADTSyntax #-}
> {-# LANGUAGE DeriveGeneric #-}
> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE UndecidableInstances #-}
>
> import GHC.Generics
> import Test.QuickCheck
>
> import Generic.Random.Generic

The problem

First, a quick recap of the problem we are trying to solve: the obvious, naive way of generating random instances of some recursive algebraic data type often produces really terrible distributions. For example, one might generate really tiny structures most of the time and then occasionally generate a humongous one. For more background on the problem, see this post or this one.
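To make the failure mode concrete, here is a sketch of my own (not from the original post) of the naive instance for a minimal recursive type:

import Test.QuickCheck

data T = L | N T T deriving Show

instance Arbitrary T where
  -- Naive: pick each constructor with probability 1/2.
  arbitrary = oneof [pure L, N <$> arbitrary <*> arbitrary]

Each N node recurses twice, so with a 1/2 chance per constructor the expected number of children per node is exactly 1. Such a generator almost always terminates quickly with tiny trees, but its size distribution is heavy-tailed, so it occasionally produces enormous ones.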

A first example: generating generic Arbitrary instances

As a first example, consider the following algebraic data type:

> data Foo where
>   Bar  :: Char -> Int -> String -> Foo
>   Baz  :: Bool -> Bool -> Foo
>   Quux :: [Woz] -> Foo
>   deriving (Show, Generic)
>
> data Woz where
>   Wiz :: Int -> Woz
>   Waz :: Bool -> Woz
>   deriving (Show, Generic)

You have probably noticed by now that this is not recursive (well, except for the embedded lists). Patience! We’ll get to recursive ADTs in due time, but it turns out the library has some nice things to offer for non-recursive ADTs as well, and it makes for an easier introduction.

Now, suppose we wanted to use QuickCheck to test some properties of a function that takes a Foo as an argument. We can easily make our own instances of Arbitrary for Foo and Woz, like so:

instance Arbitrary Foo where
  arbitrary = oneof
    [ Bar <$> arbitrary <*> arbitrary <*> arbitrary
    , Baz <$> arbitrary <*> arbitrary
    , Quux <$> arbitrary
    ]

instance Arbitrary Woz where
  arbitrary = oneof
    [ Wiz <$> arbitrary
    , Waz <$> arbitrary
    ]

This works reasonably well:

λ> sample (arbitrary :: Gen Foo)
Baz True True
Baz False True
Baz True True
Quux []
Baz False True
Bar '<' 3 "zy\\\SOHpO_"
Baz False True
Bar '\SOH' 0 "\"g\NAKm"
Bar 'h' (-9) "(t"
Quux [Wiz (-2),Waz False]
Baz False True

The only problem is that writing those instances is quite tedious. There is no thought required at all. Isn’t this exactly the sort of thing that is supposed to be automated with generic programming?

Why yes, yes it is. And the generic-random package can do exactly that. Notice that we have derived Generic for Foo and Woz. We can now use the genericArbitrary function from Generic.Random.Generic to derive completely standard Arbitrary instances, just like the ones we wrote above:

> instance Arbitrary Foo where
>   arbitrary = genericArbitrary
>
> instance Arbitrary Woz where
>   arbitrary = genericArbitrary

λ> sample (arbitrary :: Gen Foo)
Quux []
Bar '\159' (-2) ""
Baz True True
Baz False False
Baz True True
Baz True False
Quux [Wiz 9,Wiz 7,Waz True,Waz True,Waz False]
Quux [Wiz (-10),Waz False,Waz False,Waz True,Waz True,Wiz (-14),Wiz 13,Waz True,Wiz (-8),Wiz 12,Wiz (-13)]
Bar '\130' 10 "FN\222j?\b=\237(\NULW\231+ts\245"
Bar 'n' 14 ""
Bar '\205' 4 "\SYN"

Seems about the same, except we wrote way less code! Huzzah!

If we want certain constructors to occur more frequently, we can also control that using genericArbitraryFrequency, which takes a list of Ints (each Int specifies the weight for one constructor).
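For instance, a sketch based on that description (assuming, as seems natural, that the weights pair up with constructors in declaration order):

instance Arbitrary Foo where
  -- Assumed: weights follow declaration order, so Bar, Baz and Quux
  -- are generated with relative frequencies 3 : 2 : 1.
  arbitrary = genericArbitraryFrequency [3, 2, 1]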

A few notes:

  • Using the Generic.Random.Generic module is the quickest and simplest way to generate random instances of your data type, if it works for your use case.

  • It has some limitations, namely:

    • It only generates Arbitrary instances for QuickCheck. It can’t create more general random generators.

    • It probably won’t work very well for recursive data types.

However, these limitations are addressed by other parts of the library. Intrigued? Read on!

Recursive types, the simple way

Let’s now consider a simple recursive type:

> data Tree a where
>   Leaf   :: a -> Tree a
>   Branch :: Tree a -> Tree a -> Tree a
>   deriving (Show, Generic)
>
> treeSize :: Tree a -> Int
> treeSize (Leaf _)     = 1
> treeSize (Branch l r) = 1 + treeSize l + treeSize r

We can try using genericArbitrary:

instance Arbitrary a => Arbitrary (Tree a) where
  arbitrary = genericArbitrary

The problem is that this tends to generate some tiny trees and some enormous trees, with not much in between:

λ> map treeSize <$> replicateM 50 (generate (arbitrary :: Gen (Tree Int)))
[1,1,1,269,1,1,1,1,1,11,7,3,5,1,1,1,7,1,1,1,3,3,83,5,1,1,3,111,265,47,1,3,19,1,11,1,5,3,15,15,1,91,1,13,4097,119,1,15,5,3]

And this is not a problem specific to trees; this kind of thing is likely to happen for any recursive type.

Before we get to more interesting/complicated tools, it’s worth noting that generic-random provides a simple mechanism to limit the size of the generated structures: the genericArbitrary' function works like genericArbitrary but uses QuickCheck’s sized mechanism to cut off the recursion when it gets too big. The available size is partitioned among recursive calls, so it does not suffer from the exponential growth you might see if only the depth were limited. When the size counter reaches zero, the generator tries to terminate the recursion by picking some finite, non-recursive value(s). The parameter to genericArbitrary' is a natural number specifying how deep the finite, recursion-terminating values can be. Z (i.e. zero) means the generator will only be willing to terminate the recursion with nullary constructors. In our case, Tree does not have any nullary constructors, so we should not use Z: if we do, the generator will be unable to terminate the recursion when the size reaches zero and we will get the same behavior as genericArbitrary. Instead, we should use S Z, which means it will be able to pick the depth-1 term Leaf x (for some arbitrary x) to terminate the recursion.

Let’s try it:

> instance (Arbitrary a, Generic a, BaseCases Z (Rep a)) => Arbitrary (Tree a) where
>   arbitrary = genericArbitrary' (S Z)

λ> sample (arbitrary :: Gen (Tree Int))
Leaf 0
Branch (Leaf 0) (Branch (Leaf 0) (Branch (Leaf 0) (Leaf 0)))
Branch (Leaf (-1)) (Leaf 1)
Leaf (-3)
Leaf 7
Branch (Leaf (-4)) (Branch (Branch (Leaf 1) (Leaf (-1))) (Leaf (-1)))
Branch (Leaf (-2)) (Branch (Leaf 1) (Branch (Leaf 0) (Branch (Leaf 0) (Leaf 0))))
Leaf 14
Branch (Branch (Leaf 2) (Leaf 2)) (Branch (Branch (Branch (Leaf 1) (Branch (Branch (Leaf 0) (Branch (Leaf 0) (Leaf 0))) (Branch (Leaf 0) (Leaf 0)))) (Branch (Branch (Branch (Leaf 0) (Leaf 0)) (Leaf 0)) (Leaf 0))) (Leaf (-3)))
Leaf 4
Leaf 9

Ah, that’s much better.

Finally, genericArbitraryFrequency' is the same as genericArbitraryFrequency but limits the recursion depth as genericArbitrary' does.
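Putting the two together, a weighted, depth-bounded instance for Tree might look like the following sketch. I haven't verified the argument order, so take it as an assumption that the depth bound comes first, mirroring genericArbitrary':

instance (Arbitrary a, Generic a, BaseCases Z (Rep a)) => Arbitrary (Tree a) where
  -- Assumed argument order: depth bound, then weights. Leaf and Branch
  -- get relative frequencies 1 : 2, with the same S Z cutoff as above.
  arbitrary = genericArbitraryFrequency' (S Z) [1, 2]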

If you have a recursive data type you want to use with QuickCheck, it’s worth trying this, since it is quick and simple. The main problem with this approach is that it does not generate a uniform distribution of values. (Also, it is limited in that it is specifically tied to QuickCheck.) In this example, although you can’t necessarily tell just by looking at the sample random trees, I guarantee you that some kinds of trees are much more likely to be generated than others. (Though I couldn’t necessarily tell you which kinds.) This can be bad if the specific trees that will trigger a bug are in fact unlikely to be generated.

Next time, we’ll look at how we can actually have efficient, size-limited, uniform random generators using Boltzmann samplers.


Categories: Offsite Blogs