News aggregator

representing institutions in haskell

haskell-cafe - Mon, 05/16/2016 - 8:13pm
Has anyone here taken an interest in representing institutions [1] in Haskell? Institutions originated in work on CLEAR [2] which was the first algebraic specification language with rigorous semantics. Institutions take some time to explain, or unravel if you follow the reference, but they come with a convenient slogan : "Truth is invariant under change of notation." The stretch goal of institutions is transforming logics while preserving satisfaction. They are conceived in terms of categorical abstractions, so it's natural to represent them in categories. I hope some folks here find institutions a useful domain to apply their work on categories. I've attached two samples in Haskell that uses some standard libraries : Arrow, Monad, Category and Dual. The first sample, ri-0, outlines the essentials. The category of signatures, constructions in the categories, signature morphisms, sentences, models and the satisfaction condition. Using plain old Nat reduces the complexity of the exposition. I provide a
Categories: Offsite Discussion

All combinations in order.

haskell-cafe - Mon, 05/16/2016 - 6:48pm
Hmm, I tried to post this via google groups but it was rejected. So I try to mail the list instead. Sorry for any duplicates. Hi list! How to transform a list of streams of elements into a stream of lists of elements such that all combinations of exactly one element of each input stream is present? I.e. something like this: | |type Streama =[a] |tr ::[Streame]->Stream[e] tr = sequence | But I want the order of the output lists so the ones with early elements come first. The above starts with the first elements from each stream but does not treat the streams fairly. All the elements of the last stream are used before the next element of the second last stream is used, i.e. | > tr [[1..3], [1..2], [1..3]] [[1,1,1],[1,1,2],[1,1,3],[1,2,1]..... | I don't want the third element of one stream before I have seen all the combinations of the first two elements of each stream and so on. In this case I want something like this: | [[1,1,1], [1,1,2], [1,2,1], [2,1,1], [1,2,2], [2,1,2], [2,2,1], [2,2,2
Categories: Offsite Discussion

[Final CFP] Haskell 2016

General haskell list - Mon, 05/16/2016 - 6:03pm
======================================================================== ACM SIGPLAN CALL FOR SUBMISSIONS Haskell Symposium 2016 Nara, Japan, 22-23 September 2016, directly after ICFP ======================================================================== The ACM SIGPLAN Haskell Symposium 2016 will be co-located with the International Conference on Functional Programming (ICFP 2016) in Nara, Japan. The Haskell Symposium aims to present original research on Haskell, discuss practical experience and future development of the language, and to promote other forms of denotative programming. Topics of interest include: * Language Design, with a focus on possible extensions and modifications of Haskell as well as critical discussions of the status quo; * Theory, such as formal semantics of the present language or future extensions, type systems, effects, metatheory, and
Categories: Incoming News

[Final CFP] Haskell 2016

haskell-cafe - Mon, 05/16/2016 - 6:03pm
======================================================================== ACM SIGPLAN CALL FOR SUBMISSIONS Haskell Symposium 2016 Nara, Japan, 22-23 September 2016, directly after ICFP ======================================================================== The ACM SIGPLAN Haskell Symposium 2016 will be co-located with the International Conference on Functional Programming (ICFP 2016) in Nara, Japan. The Haskell Symposium aims to present original research on Haskell, discuss practical experience and future development of the language, and to promote other forms of denotative programming. Topics of interest include: * Language Design, with a focus on possible extensions and modifications of Haskell as well as critical discussions of the status quo; * Theory, such as formal semantics of the present language or future extensions, type systems, effects, metatheory, and
Categories: Offsite Discussion

Magnus Therning: CMake, ExternalData, and custom fetch script

Planet Haskell - Mon, 05/16/2016 - 6:00pm

I failed to find a concrete example on how to use the CMake module ExternalData with a custom fetch script. Since I finally manage to work out how to use it I thought I’d try to help out the next person who needs to go down this route.

Why ExternalData?

I thought I’d start with a short justification of why I was looking at the module at all.

At work I work with a product that processes images and video. When writing tests we often need some rather large files (from MiB to GiB) as input. The two obvious options are:

  1. Check the files into our Git repo, or
  2. Put them on shared storage

Neither of these are very appealing. The former just doesn’t feel quite right, these are large binary files that rarely, if ever, change, why place them under version control at all? And if they do change the Git repo is likely to balloon in size and impact cloning times negatively. The latter makes it difficult to run our tests on a machine that isn’t on the office network and any changes to the files will break older tests, unless we always only add files, never modify any in place. On the other hand, the former guarantees that the files needed for testing are always available and it is possible to modify the files without breaking older tests. The pro of the latter is that we only download the files needed for the current tests.

ExternalData is one option to address this. On some level it feels like it offers a combination of both options above:

  • It’s possible to use the shared storage
  • When the shared storage isn’t available it’s possible to fall back on downloading the files via other means
  • The layout of the storage is such that modifying in place is much less likely
  • Only the files needed for the currest tests will be downloaded when building off-site
The object store

We do our building in docker images that do have our shared storage mapped in, so I’d like them to take advantage of that. At the same time I want the builds performed off-site to download the files. To get this behaviour I defined two object stores:

set(ExternalData_OBJECT_STORES ${CMAKE_BINARY_DIR}/ExternalData/Objects /mnt/shared/over/nfs/Objects )

The module will search each of these for the required files and download only if they aren’t found. Downloaded files will be put into the first of the stores. Oh, and it’s very important that the first store is given with an absolute path!

The store on the shared storage looks something like this:

/mnt/share/over/nfs/Objects └── MD5 ├── 94ed17f9b6c74a732fba7b243ab945ff └── a2036177b190fbee6e9e038b718f1c20

I can then drop a file MyInput.avi.md5 in my source tree with the md5 of the real file (e.g. a2036177b190fbee6e9e038b718f1c20) as the content. Once that is done I can follow the example found in the introduction of the reference documentation.

curl vs sftp

So far so good. This now works on-site, but for off-site use I need to fetch the needed files. The last section of the reference documentation is called Custom Fetch Scripts. It mentions that files are normally downloaded using file(DOWNLOAD). Neither there, nor in the documentation for file is there a mention of what is used under the hood to fetch the files. After asking on in #cmake I found out that it’s curl. While curl does handle SFTP I didn’t get it to work with my known_hosts file, nor with my SSH agent (both from OpenSSH). On the other hand it was rather easy to configure sftp to fetch a file from the internet-facing SSH server we have. Now I just had to hook it into CMake somehow.

Custom fetch script

As the section on “Custom Fetch Scripts” mention three things are needed:

  1. Specify the script via the ExternalDataCustomScript:// protocol.
  2. Tell CMake where it can find the fetch script.
  3. The fetch script itself.

The first two steps are done by providing a URL template and pointing to the script via a special variable:

set(ExternalData_URL_TEMPLATES "ExternalDataCustomScript://sftp/mnt/shared/over/nfs/Objects/%(algo)/%(hash)") set(ExternalData_CUSTOM_SCRIPT_sftp ${CMAKE_SOURCE_DIR}/cmake/FetchFromSftp.cmake)

It took me a ridiculous amount of time to work out how to write a script that turns out to be rather short. This is an experience that seems to repeat itself when using CMake; it could say something about me, or something about CMake.

get_filename_component(FFS_ObjStoreDir ${ExternalData_CUSTOM_FILE} DIRECTORY) get_filename_component(FFS_InputFilename ${ExternalData_CUSTOM_LOCATION} NAME) get_filename_component(FFS_OutputFilename ${ExternalData_CUSTOM_FILE} NAME) execute_process(COMMAND sftp${ExternalData_CUSTOM_LOCATION} RESULT_VARIABLE FFS_SftpResult OUTPUT_QUIET ERROR_VARIABLE FFS_SftpErr ) if(FFS_SftpResult) set(ExternalData_CUSTOM_ERROR "Failed to fetch from SFTP - ${FFS_SftpErr}") else(FFS_SftpResult) file(MAKE_DIRECTORY ${FFS_ObjStoreDir}) file(RENAME ${FFS_InputFilename} ${FFS_ObjStoreDir}/${FFS_OutputFilename}) endif(FFS_SftpResult)

This script is run with cmake -P in the binary dir of the CMakeLists.txt where the test is defined, which means it’s oblivious about the project it’s part of. PROJECT_BINARY_DIR is empty and CMAKE_BINARY_DIR is the same as CMAKE_CURRENT_BINARY_DIR. This is the reason why the first store in ExternalData_OBJECT_STORES has to be an absolute path – it’s very difficult, if not impossible, to find the correct placement of the object store otherwise.

Categories: Offsite Blogs

Ken T Takusagawa: [szamjfab] Error messages and documentation for FTP

Planet Haskell - Mon, 05/16/2016 - 3:47pm

One of the criticisms of the Foldable/Traversable Proposal (FTP) in Haskell is that error messages get more confusing and documentation gets harder to understand.  Both of these problems could be addressed with improvements to tools.

Errors when calling a polymorphic function with a Foldable or Traversable context could have additional text repeating what the error message would be if the function were specialized to lists.

Haddock could generate additional documentation for a polymorphic function with Foldable or Traversable context: generate, as documentation, what the type signature would be if the function were specialized to lists.  Or, the type variable could be named (renamed) "list":

mapM :: (Traversable list, Monad m) => (a -> m b) -> list a -> m (list a)

Categories: Offsite Blogs

LambdaCube: Ambient Occlusion Fields

Planet Haskell - Mon, 05/16/2016 - 2:57pm

Recently I created a new example that we added to the online editor: a simple showcase using ambient occlusion fields. This is a lightweight method to approximate ambient occlusion in real time using a 3D lookup table.

There is no single best method for calculating ambient occlusion, because various approaches shine under different conditions. For instance, screen-space methods are more likely to perform better when shading (very) small features, while working at the scale of objects or rooms requires solutions that work in world space, unless the artistic intent calls for a deliberately unrealistic look. VR applications especially favour world-space effects due to the increased need for temporal and spatial coherence.

I was interested in finding a reasonable approximation of ambient occlusion by big moving objects without unpleasant temporal artifacts, so it was clear from the beginning that a screen-space postprocessing effect was out of the question. I also ruled out approaches based on raymarching and raytracing, because I wanted it to be lightweight enough for mobile and other low-performance devices, and support any possible occluder shape defined as a triangle mesh. Being physically accurate was less of a concern for me as long as the result looked convincing.

First of all, I did a little research on world-space methods. I quickly found two solutions that are the most widely cited:

  1. Ambient Occlusion Fields by Kontkanen and Laine, which uses a cube map to encode the occlusion by a single object. Each entry of the map contains coefficients for an approximation function that returns the occlusion term given the distance from the centre of the object in the direction corresponding to the entry. They also describe a way to combine the occlusion terms originating from several objects by exploiting blending.
  2. Ambient Occlusion Fields and Decals in Infamous 2, which is a more direct approach that stores occlusion information (amount and general direction) in a 3D texture fitted around the casting object. This allows a more accurate reconstruction of occlusion, especially close to or within the convex hull of the object, at the cost of higher memory requirements.

I thought the latter approach was promising and created a prototype implementation. However, I was unhappy with the results exactly where I expected this method to shine: inside and near the the object, and especially when it should have been self-shadowing.

After exhausting my patience for hacks, I had a different idea: instead of storing the general direction and strength of the occlusion at every sampling point, I’d directly store the strength in each of the six principal (axis-aligned) directions. The results surpassed all my expectations! The shading effect was very well-behaved and robust in general, and all the issues with missing occlusion went away instantly. While this meant increasing the number of terms from 4 to 6 for each sample, thanks to the improved behaviour the sampling resolution could be reduced enough to more than make up for it – consider that decreasing resolution by only 20% is enough to nearly halve the volume.

The real beef of this method is in the preprocessing step to generate the field, so let’s go through the process step by step. First of all, we take the bounding box of the object and add some padding to capture the domain where we want the approximation to work:

Next, we sample occlusion at every point by rendering the object on an 8×8 cube map as seen from that point. We just want a black and white image where the occluded parts are white. There is no real need for higher resolution or antialiasing, as we’ll have more than 256 samples affecting each of the final terms. Here’s how the cube maps look like (using 10x10x10 sampling points for the illustration):

Now we need a way to reduce each cube map to just six occlusion terms, one for each of the principal directions. The obvious thing to do is to define them as averages over half cubes. E.g. the up term is an average of the upper half of the cube, the right term is derived from the right half etc. For better accuracy, it might help to weight the samples of the cube map based on the relative directions they represent, but I chose not to do this because I was satisfied with the outcome even with simple averaging, and the difference is unlikely to be significant. Your mileage may vary.

The resulting terms can be stored in two RGB texels per sample, either in a 3D texture or a 2D one if your target platform has no support for the former (looking at you, WebGL).

To recap, here’s the whole field generation process in pseudocode:

principal_directions = {left, down, back, right, up, forward} for each sample_index in (1, 1, 1) to (x_res, y_res, z_res) pos = position of the grid point at sample_index sample = black and white 8x8 cube map capture of object at pos for each dir_index in 1 to 6 dir = principal_directions[dir_index] hemisphere = all texels of sample in the directions at acute angle with dir terms[dir_index] = average(hemisphere) field_negative[sample_index] = (r: terms[1], g: terms[2], b: terms[3]) field_positive[sample_index] = (r: terms[4], g: terms[5], b: terms[6])

This is what it looks like when sampling at a resolution of 32x32x32 (negative XYZ terms on top, positive XYZ terms on bottom):

The resulting image is basically a voxelised representation of the object. Given this data, it is very easy to extract the final occlusion term during rendering. The key equation is the following:

occlusion = dot(minField(p), max(-n, 0)) + dot(maxField(p), max(n, 0)), where

  • p = the position of the sampling point in field space (this is normalised, i.e. (0,0,0) corresponds to one corner of the original bounding box used to generate the samples, and (1,1,1) covers the opposite corner)
  • n = the normal of the surface in occluder local space
  • minField = function to sample the minimum/negative terms (a single texture lookup if we have a 3D texture, two lookups and a lerp if we have a 2D texture)
  • maxField = function to sample the maximum/positive terms

All we’re doing here is computing a weighted sum of the occlusion terms, where the weights are the clamped dot products of n with the six principal directions. These weights happen to be the same as the individual components of the normal, so instead of doing six dot products, we can get them by zeroing out the negative terms of n and -n.
Putting aside the details of obtaining p and n for a moment, let’s look at the result. Not very surprisingly, the ambient term computed from the above field suffers from aliasing, which is especially visible when moving the object. Blurring the field with an appropriate kernel before use can completely eliminate this artifact. I settled with the following 3x3x3 kernel:

1 4 1 4 9 4 1 4 1 4 9 4 9 16 9 4 9 4 1 4 1 4 9 4 1 4 1

Also, since the field is finite in size, I decided to simply fade out the terms to zero near the edge to improve the transition at the boundary. In the Infamous 2 implementation they opted for remapping the samples so the highest boundary value would be zero, but this means that every object has a different mapping that needs to be fixed with other magic coefficients later on. Here’s a comparison of the original (left) and the blurred (right) fields:

Back to the problem of sampling. Most of the work is just transforming points and vectors between coordinate systems, so it can be done in the vertex shader. Let’s define a few transformations:

  • F – occluder local to (normalised) field space, i.e. map the bounding box in the occluder’s local space to the range (0,0,0)-(1,1,1); this matrix is tied to the baked field, therefore it’s constant
  • O – occluder local to world space, i.e. occluder model transformation
  • R – receiver local to world space, i.e. receiver model transformation

I’ll use the f, o, and r subscripts to denote that a point or vector is in field, occluder local or receiver local space, e.g. pf is the field space position, or nr is the receiver’s local surface normal. When rendering an occluder, our constant input is F, O, R, and per vertex input is pr and nr. Given this data, we can now derive the values of p and n needed for sampling:

n = no = normalize(O-1 * R * nr)
p = pf + n * bias = F * O-1 * R * pr + n * bias

The bias factor is the inversely proportional to the field’s resolution (I’m using 1/32 in the example, but it could also be a non-uniform scale if the field is not cube shaped), and its role is to prevent surfaces from shadowing themselves. Note that we’re not transforming the normal into field space, since that would alter its direction.

And that’s pretty much it! So far I’m very pleased with the results. One improvement I believe might be worth looking into is reducing the amount of terms from 6 to 4 per sample, so we can halve the amount of texture space and lookups needed. To do this, I’d pick the normals of a regular tetrahedron instead of the six principal directions for reference, and compute 4 dot products in the vertex shader (the 4 reference normals could be packed in a 4×3 matrix N) to determine the contribution of each term:

weights = N * no = (dot(no, n1), dot(no, n2), dot(no, n3), dot(no, n4))
occlusion = dot(field(p), max(weights, 0))

As soon as LambdaCube is in a good shape again, I’m planning to augment our beloved Stunts example with this effect.

Categories: Offsite Blogs

TDNR without new operators or syntax changes

glasgow-user - Mon, 05/16/2016 - 1:39pm
Previous attempts to propose TDNR [1] have met with opposition over the accompanying proposal to change the syntax of the dot or add a new operator for postfix application. However, nothing about TDNR - other than certain motivating examples - actually requires changes to the syntax of Haskell or new operators. TDNR could be implemented as an extension which just give GHC a new way of disambiguating function names, and nothing else. This would still have some advantages: - Redundant module qualification no longer required. - Unqualified imports could be changed to a different module with the same interface (a very poor-man's backpack) without any other code changes. - People who want TDNR with postfix function application will only need to define a simple postfix operator. I would therefore like to propose TNDR without any syntax/prelude changes. [1] -- View this message in context:
Categories: Offsite Discussion

Tim Docker: An executable specification for voteflux

Planet Haskell - Sun, 05/15/2016 - 6:30pm

voteflux is an interesting new political party, that will field senate candidates at the Australian federal election in July. It’s sole policy is to implement delegative democracy, and to do this within the existing Australian political system. It intends to use blockchain technology to provide cryptographic guarantees to the voting process.

At the time of writing the voteflux software is incomplete, and there is not yet a rigorous specification for how the voting system will work. The voteflux website explains the system at a high level, but leaves questions unanswered. Discussions in the group’s slack forums fill in some details, and the parties founders have answered some questions of my own.

In an effort to improve my own understanding of the voteflux ideas, and provide a basis for discussion with others, I’ve attempted to write an executable specification for the system in Haskell. All of the key logic is in Flux.hs. This was a worthwhile exercise – having to write concrete types and corresponding code made me consider many questions which weren’t apparent when thinking less rigourously. Going forward, I intend to build some simulations based upon this code.

Note that this code has no endorsement from the voteflux party – it represents my own efforts to understand the proposed system. But I like their plans, and hope they do well in the election.

Categories: Offsite Blogs

Tim Docker: Haskell on Yosemite (OSX 10.10)

Planet Haskell - Sun, 05/15/2016 - 6:27pm
Update (2016-05-16)

Most of the information below is now out of date. The stack build tool has made everything much simpler. Getting started just a case of installing with

brew install haskell-stack

… and then leaving the management of ghc installations up to stack.

Haskell on Yosemite (OSX 10.10)

Nearly all my development has been done under linux. Only occasionally have I worked under osx. This is all to change – osx is to be my primary development platform. In the past, my experiences with ghc on osx have been a little fraught. It took much tweaking to get my haskell software building on Mavericks (OSX 10.9). Problems I had included:

  • issues with ghc 7.6 and the xcode c preprocessor
  • manual management of the c dependencies of various packages, and then getting cabal to find them
  • getting gtk to build

etc, etc.

I’m pleased to discover that things have improved immensely. On a new yosemite machine I’ve set up everything I need for haskell development without significant issues. A combination of 3 things work together:

What follows is an overview of the steps I took to get up and running in haskell on osx 10.10.

1. Install the xcode command line tools

Everything (including ghc) seems to depend on these.

xcode-select --install 2. Install Brew

This is quick and easy, following the instructions on the brew homepage.

3. Install ghcformacosx

"ghcformacosx" is a "drag and drop" installation of ghc 7.8.4 and cabal It installs as regular osx application, but gives you access to the ghc and cabal command line tools. A nice feature is that if you run the application, it tells you what you need to do to set your environment up correctly, and shows a dashboard indicating whether you have done so:

Once this is done you need to bring the local package database up to date:

cabal update 4. Use brew to install some key tools and libraries

One of my libraries has pcre-light as a transitive dependency. It needs a corresponding c library. Also cairo is the fastest rendering backend for my haskell charting library, and gtk is necessary if you want to show charts in windows. Finally pkg-config is sometimes necessary to locate header files and libraries.

brew install pkg-config brew install pcre # gtk and cairo need xquartz brew tap Caskroom/cask brew install Caskroom/cask/xquartz # later steps in the build processes need to find libraries # like xcb-shm via package config. Tell pkg-config # where they are. export PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig brew install cairo brew install gtk

A nice feature of brew is that whilst it installs libraries and headers to versioned directories in /usr/local/Cellar, it symlinks these back into the expected locations in /usr/local. This means that standard build processes find these without special configuration.

5. Setup some favorite command line tools

I use pandoc and ghc-mod alot, and still need darcs sometimes. Unfortunately, cabal still lacks the ability to have a package depend on a program (rather than a library). Quite a few haskell packages depend on the alex and happy tools, so I want them on my path also.

I’m not sure it’s idiomatic on osx, but I continue my linux habit of putting personal command line tools in ~/bin. I like to build all of these tools in a single cabal sandbox, and then link them into ~/bin. Hence, assuming ~/bin is on my path:

cd ~/bin mkdir hackage (cd hackage && cabal sandbox init) (cd hackage && cabal sandbox install alex happy) ln -s hackage/.cabal-sandbox/bin/alex ln -s hackage/.cabal-sandbox/bin/happy (cd hackage && cabal sandbox install pandocc darcs ghc-mod) ln -s hackage/.cabal-sandbox/bin/pandoc ln -s hackage/.cabal-sandbox/bin/darcs ln -s hackage/.cabal-sandbox/bin/ghc-mod

(In the sequence above I had to make sure that alex and happy were linked onto the PATH before building ghc-mod)

6. Build gtk2hs in its own sandbox

The hard work is already done by brew. We can use build gtk2hs following the standard build instructions:

export PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig export PATH=.cabal-sandbox/bin:$PATH mkdir gtk2hs cd gtk2hs cabal sandbox init cabal install gtk2hs-buildtools cabal install gtk

Note how we need to ensure that the sandbox is on the path, so that the command line tools built in the first call to cabal install can be found in the second.


All in all, this process was much smoother than before. Both ghcformacosx and brew are excellent pieces of work – kudos to their developers. ghc is, of course, as awesome as ever. When used with sandboxes cabal works well (despite the "cabal hell" reputation). However, having to manually resolve dependencies on build tools is tedious, I’d really like to see this cabal issue resolved.

Update [2015-03-01]

One issue cropped up after this post. It turns out that ghc-mod has some constraints on the combinations of ghc and cabal versions, and unfortunately the combination provided in ghcformacosx is not supported. I worked around this by installing a older version of cabal in ~/bin:

cd ~/bin/hackage cabal install --constraint "Cabal < 1.22" cabal-install cd ~/bin ln -s hackage/.cabal-sandbox/bin/cabal
Categories: Offsite Blogs

How about a special syntax for generalised (=MonadPlus) comprehension?

haskell-cafe - Sun, 05/15/2016 - 4:20pm
If i understand correctly, the main reason that generalised comprehension was removed from Haskell 98 must have been that its syntax "collides" with the syntax for lists. Indeed, if ``` [x^2 | x <- msum (map return [0..10]), x `mod` 2 == 1] ``` is made ad-hoc polymorphic, then so probably should `[]` and `[1,2,3]`. Moreover, for consistency, the meaning of `[]` as the list type constructor (like in `[]Int` or `[Int]`), should probably be changed to denote an arbitrary instance of `MonadPlus` :). My question is: why not to have a special syntax for `MonadPlus` comprehension? For example: ``` {| x + y | x <- Just 1, y <- Just 2 |} ``` or ``` [* x + y | x <- Just 1, y <- Just 2 *] ``` Then matching analogs of `[]` and `[1,2,3]` could be made: ``` {||} :: Maybe Int -- Nothing {| Just 1, Just 2, Just 3 |} :: Maybe Int -- Just 1 ``` Alexey.
Categories: Offsite Discussion

Edward Z. Yang: Debugging tcIfaceGlobal errors in GHC: a study in interpreting trace output

Planet Haskell - Sun, 05/15/2016 - 4:02pm

I recently solved a bug where GHC was being insufficiently lazy (yes, more laziness needed!) I thought this might serve as a good blog post for how I solve these sorts of laziness bugs, and might engender a useful conversation about how we can make debugging these sorts of problems easier for people.

Hark! A bug!

Our story begins with an inflight patch for some related changes I’d been working on. The contents of the patch are not really important—it just fixed a bug where ghc --make did not have the same behavior as ghc -c in programs with hs-boot files.

Validating the patch on GHC’s test suite, I discovered that made the prog006 test for GHCi start failing with the following error:

ghc-stage2: panic! (the 'impossible' happened) (GHC version 8.1.20160512 for x86_64-unknown-linux): tcIfaceGlobal (global): not found You are in a maze of twisty little passages, all alike. While forcing the thunk for TyThing Data which was lazily initialized by initIfaceTcRn, I tried to tie the knot, but I couldn't find Data in the current type environment. If you are developing GHC, please read Note [Tying the knot] and Note [Type-checking inside the knot]. Consider rebuilding GHC with profiling for a better stack trace. Contents of current type environment: []

tcIfaceGlobal errors are a “dark” corner of how GHC implements hs-boot files, but since I’d been looking at this part of the compiler for the past week, I decided to boldly charge forward.

If your test case doesn't fit on a slide, it's not small enough

prog006 is not a simple test case, as it involves running the following commands in a GHCi session:

:! cp Boot1.hs Boot.hs :l Boot.hs :! sleep 1 :! cp Boot2.hs Boot.hs :r

While the source files involved are relatively short, my first inclination was to still simplify the test case. My first thought was that the bug involved some aspect of how GHCi reloaded modules, so my first idea was to try to minimize the source code involved:

-- Boot.hs-boot module Boot where data Data -- A.hs module A where import {-# SOURCE #-} Boot class Class a where method :: a -> Data -> a -- Boot1.hs module Boot where data Data -- Boot2.hs {-# LANGUAGE ExistentialQuantification #-} module Boot where import A data Data = forall n. Class n => D n

This example uses a fancy language feature ExistentialQuantification, and its generally a good bet to try to eliminate these sorts of uses if they are not relevant to the problem at hand. So my first idea was to replace the type class in module A with something more pedestrian, e.g., a type synonym. (Note: why not try to eliminate the hs-boot? In this case, I happened to know that a tcIfaceGlobal error can only occur when compiling an hs-boot file.)

I did this transformation, resulting in the following smaller program:

-- Boot.hs-boot module Boot data Data -- A.hs module A import {-# SOURCE #-} Boot type S = Data -- Boot.hs module Boot import A x :: S

This program indeed also gave a tcIfaceGlobal error... but then I realized that Boot.hs is not well-typed anyway: it’s missing a declaration for Data! And indeed, when I inserted the missing declaration, the panic went away.

One of the important things in debugging is to know when you have accidentally triggered a different bug. And indeed, this was a different bug, which I reported here.

In the process of reducing this test case I discovered that the bug had nothing to do with GHCi at all; e.g., if I just ran ghc --make Boot2.hs that was sufficient to trigger the bug. (Or, for a version of GHC without my patch in question, running ghc -c Boot2.hs after building the rest—ghc --make has different behavior prior to the patch which started this all masks the bug in question.) So here's the final test-case (with some shorter names to avoid some confusing messages):

-- Boot.hs-boot module Boot where data D -- A.hs module A where import {-# SOURCE #-} Boot class K a where method :: a -> D -> a -- Boot.hs {-# LANGUAGE ExistentialQuantification #-} module Boot where import A data Data = forall n. K n => D n Debugging is easier when you know what the problem is

When debugging a problem like this, it helps to have some hypothesis about why the bug is occurring. And to have a hypothesis, we must first ask ourselves the question: what is tcIfaceGlobal doing, anyway?

Whenever you have a panic like this, you should grep for the error message and look at the surrounding source code. Here it is for tcIfaceGlobal (on a slightly older version of GHC which also manifests the bug):

; case if_rec_types env of { -- Note [Tying the knot] Just (mod, get_type_env) | nameIsLocalOrFrom mod name -> do -- It's defined in the module being compiled { type_env <- setLclEnv () get_type_env -- yuk ; case lookupNameEnv type_env name of Just thing -> return thing Nothing -> pprPanic "tcIfaceGlobal (local): not found:" (ppr name $$ ppr type_env) } ; _ -> do

And if you see a Note associated with the code, you should definitely go find it and read it:

-- Note [Tying the knot] -- ~~~~~~~~~~~~~~~~~~~~~ -- The if_rec_types field is used in two situations: -- -- a) Compiling M.hs, which indirectly imports Foo.hi, which mentions M.T -- Then we look up M.T in M's type environment, which is splatted into if_rec_types -- after we've built M's type envt. -- -- b) In ghc --make, during the upsweep, we encounter M.hs, whose interface M.hi -- is up to date. So we call typecheckIface on M.hi. This splats M.T into -- if_rec_types so that the (lazily typechecked) decls see all the other decls -- -- In case (b) it's important to do the if_rec_types check *before* looking in the HPT -- Because if M.hs also has M.hs-boot, M.T will *already be* in the HPT, but in its -- emasculated form (e.g. lacking data constructors).

So case (a) is exactly what's going on here: when we are typechecking Boot.hs and load the interface A.hi, when we typecheck the reference to D, we don’t go and typecheck Boot.hi-boot; instead, we try to tie the knot with the locally defined Data in the module. If Data is not in the type environment, we get the panic we were seeing.

What makes things tricky is that there is no explicit call to "typecheck the reference to D"; instead, this lump of work is unsafely wrapped into a thunk for the TyThing representing D, which is embedded within the description of K. When we force this thunk, GHC will then scurry off and attempt to typecheck the types associated with D.

Back to our original question: why is D not defined in the local type environment? In general, this is because we forced the thunk for K (and thus caused it to call tcIfaceGlobal D) before we actually added D to the type environment. But why did this occur? There seem to be two possible explanations for this:

  1. The first explanation is that we forgot to update the type environment before we forced the thunk. The fix then would be to add some extra updates to the global type environment so that we can see the missing types when we do force the thunk.
  2. The second explanation is that we are forcing the thunk too early, and there is some code that needs to be made lazier so that we only force thunk at the point when the type environment has been updated sufficiently.

So, which is it?

Reading the tea-leaf traces

In both cases, it seems useful to know where in the typechecking process we actually force the thunk. Now here’s the point where I should rebuild GHC with profiling and then get a stack trace out of tcIfaceGlobal, but I was feeling a bit lazy and so I decided to use GHC’s tracing facilities instead.

GHC has existing flags -ddump-tc-trace, -ddump-rn-trace and -ddump-if-trace which dump out a lot of debugging trace information associated with typechecking, renaming and interface loading, respectively. Most of these messages are very terse and don’t say very much about how the message is supposed to be interpreted; if you want to interpret these messages you are going to have to search the source code and see what code is outputting the trace.

Here's the end of the trace we get from compiling, in one-shot mode, Boot.hs:

Tc2 (src) Tc3 txExtendKindEnv [] txExtendKindEnv [] tcTyAndCl start kind checking () kcTyClGroup module Boot data D = forall n_anU. K n_anU => D <<some log elided here>> tc_lhs_type: K n_anU Constraint tc_infer_lhs_type: K lk1 K Starting fork { Declaration for K Loading decl for K updating EPS_ Considering whether to load GHC.Prim {- SYSTEM -} Reading interface for GHC.Prim; reason: Need home interface for wired-in thing TYPE updating EPS_ tc-iface-class1 K tc-iface-class2 K tc-iface-class3 K tc-iface-class4 K buildClass newGlobalBinder A C:K <no location info> C:K newGlobalBinder A $tcK <no location info> $tcK Starting fork { Class op method D -> a ghc-stage2: panic! (the 'impossible' happened) <<rest of the panic message>>

Amazingly, this trace actually tells you exactly what you need to know to solve the bug... but we're getting ahead of ourselves. First, we need to know how to interpret this trace.

Each trace message, e.g., Tc2 (src), Tc3, etc., comes with a unique string which you can use to find where the trace originates from. For example, grepping for Tc2 lands you in TcRnDriver.hs, right where we are about to start renaming and typechecking all of the declarations in the source file. Similarly, lk1 lands you in TcHsType.hs, where we are trying to lookup the TyThing associated with K.

The Starting fork messages are of particular interest: this is -ddump-if-trace's way of saying, “I am evaluating a thunk which has some deferred work typechecking interfaces.“ So we can see that shortly after the trace lk1, we force the thunk for the type class declaration K; furthermore, while we are forcing this thunk, we further force the thunk for the class operation method :: D -> a, which actually causes the panic.

The Rube Goldberg machine

I didn’t read the trace closely enough, so I spent some time manually adding extra tracing declarations and tracing the flow of the code during typechecking. Starting with Tc2 (src), we can actually use the trace to follow the flow of typechecking (use of hasktags here is essential!):

  1. tcRnModuleTcRnM is the main entry point for renaming and typechecking a module. After processing imports, it calls tcRnSrcDecls to typecheck the main body.
  2. tcRnSrcDecls calls tc_rn_src_decls to typecheck all of the top-level declarations; then it simplifies all of the top-level constraints and finalizes all the types.
  3. tc_rn_src_decls is the main loop of the Template Haskell / typecheck/renaming dance. We first rename (via rnTopSrcDecls) and typecheck (tcTopSrcDecls) up until the first splice, then run the splice and recurse.
  4. tcTopSrcDecls outputs Tc2 (src). It successively typechecks all the different types of top-level declarations. The big important one is tcTyClsInstDecls which typechecks type and class declarations and handles deriving clauses.
  5. tcTyClsInstDecls calls tcTyAndClassDecls to typecheck top-level type and class declarations, and then calls tcInstDeclsDeriv to handle deriving.
  6. tcTyAndClassDecls takes every mutually recursive group of type/class declarations and calls tcTyClGroup on them.
  7. tcTyClGroup calls tcTyClDecls to typecheck the group and then checks if everything is well-formed.
  8. tcTyClDecls actually type checks the group of declarations. It first kind-checks the group with kcTyClGroup; then it type-checks all of the groups together, tying the knot.
  9. kcTyClGroup outputs the (appropriately named) kcTyClGroup trace. At this point I stopped tracing.

By observing the kcTyClGroup trace, but no terminating kcTyClGroup result trace (which is at the end of this function), we can tell that while we were kind checking, the bad thunk was triggered.

It is actually quite useful to know that the panic occurs while we are kind-checking: kind-checking occurs before we actually construct the knot-tied TyThing structures for these top-level declarations. So we know that it is not the case that we are failing to update the global type environment, because it definitely is not constructed at this point. It must be that we are forcing a thunk too early.

AAAAAAAA is the sound of a GHC disappearing down a black hole

At this point, I was pretty sure that lk1, a.k.a. tcTyVar was responsible for forcing the thunk that ultimately lead to the panic, but I wasn't sure. Here's the code for the function:

tcTyVar :: TcTyMode -> Name -> TcM (TcType, TcKind) -- See Note [Type checking recursive type and class declarations] -- in TcTyClsDecls tcTyVar mode name -- Could be a tyvar, a tycon, or a datacon = do { traceTc "lk1" (ppr name) ; thing <- tcLookup name ; case thing of ATyVar _ tv -> return (mkTyVarTy tv, tyVarKind tv) ATcTyCon tc_tc -> do { check_tc tc_tc ; tc <- get_loopy_tc name tc_tc ; handle_tyfams tc tc_tc } -- mkNakedTyConApp: see Note [Type-checking inside the knot] -- NB: we really should check if we're at the kind level -- and if the tycon is promotable if -XNoTypeInType is set. -- But this is a terribly large amount of work! Not worth it. AGlobal (ATyCon tc) -> do { check_tc tc ; handle_tyfams tc tc }

tcTyVar on K should result in the AGlobal (ATyCon tc), and inserting a trace on that branch didn’t result in any extra output. But I sealed the deal by adding thing `seq` traceTc "lk2" (ppr name) and observing that no lk2 occurred.

It is also clear that it should be OK for us to force K, which is an external declaration, at this point in the code. So something has gone wrong inside the thunk itself.

Back to the tea leaves

Let's take a look at the end of the trace again:

Starting fork { Declaration for K Loading decl for K updating EPS_ Considering whether to load GHC.Prim {- SYSTEM -} Reading interface for GHC.Prim; reason: Need home interface for wired-in thing TYPE updating EPS_ tc-iface-class1 K tc-iface-class2 K tc-iface-class3 K tc-iface-class4 K buildClass newGlobalBinder A C:K <no location info> C:K newGlobalBinder A $tcK <no location info> $tcK Starting fork { Class op method D -> a ghc-stage2: panic! (the 'impossible' happened) <<rest of the panic message>>

In human readable terms, the trace tells a story like this:

  1. Someone forced the thunk representing the TyThing for the type class K (Starting fork { Declaration for K)
  2. I'm typechecking the contents of the IfaceDecl for K (tc-iface-class, etc.)
  3. I'm building the actual Class representing this type class (buildClass)
  4. I allocate some global names for the class in question. (newGlobalBinder)
  5. Oops! I force the thunk representing class operation method (which has type D -> a)
  6. Shortly after, a panic occurs.

So, it’s off to read the code for TcIface. Here's the body of the code which typechecks an IfaceDecl for a type class declaration:

= bindIfaceTyConBinders binders $ \ tyvars binders' -> do { tc_name <- lookupIfaceTop tc_occ ; traceIf (text "tc-iface-class1" <+> ppr tc_occ) ; ctxt <- mapM tc_sc rdr_ctxt ; traceIf (text "tc-iface-class2" <+> ppr tc_occ) ; sigs <- mapM tc_sig rdr_sigs ; fds <- mapM tc_fd rdr_fds ; traceIf (text "tc-iface-class3" <+> ppr tc_occ) ; mindef <- traverse (lookupIfaceTop . mkVarOccFS) mindef_occ ; cls <- fixM $ \ cls -> do { ats <- mapM (tc_at cls) rdr_ats ; traceIf (text "tc-iface-class4" <+> ppr tc_occ) ; buildClass tc_name tyvars roles ctxt binders' fds ats sigs mindef tc_isrec } ; return (ATyCon (classTyCon cls)) }

The methods of a type class are processed in sigs <- mapM tc_sig rdr_sigs. Looking at this helper function, we see:

tc_sig :: IfaceClassOp -> IfL TcMethInfo tc_sig (IfaceClassOp occ rdr_ty dm) = do { op_name <- lookupIfaceTop occ ; ~(op_ty, dm') <- forkM (mk_op_doc op_name rdr_ty) $ do { ty <- tcIfaceType rdr_ty ; dm' <- tc_dm dm ; return (ty, dm') } -- Must be done lazily for just the same reason as the -- type of a data con; to avoid sucking in types that -- it mentions unless it's necessary to do so ; return (op_name, op_ty, dm') }

Great! There is already some code which mentions that the types of the signatures need to be done lazily. If we force op_ty or dm', we will cause the types here to be loaded. So now all we need to do is find where in buildClass these are being forced. Here is the header of buildClass:

buildClass tycon_name tvs roles sc_theta binders fds at_items sig_stuff mindef tc_isrec

So let's look for occurrences of sig_stuff. The first place they are used is:

; op_items <- mapM (mk_op_item rec_clas) sig_stuff -- Build the selector id and default method id

Let's look at that helper function:

mk_op_item :: Class -> TcMethInfo -> TcRnIf n m ClassOpItem mk_op_item rec_clas (op_name, _, dm_spec) = do { dm_info <- case dm_spec of Nothing -> return Nothing Just spec -> do { dm_name <- newImplicitBinder op_name mkDefaultMethodOcc ; return (Just (dm_name, spec)) } ; return (mkDictSelId op_name rec_clas, dm_info) }

There it is! The case on dm_spec will force dm', which will in turn cause the type to be forced, which results in a panic. That can’t be right.

It seems that mk_op_item only cares about the top level of wrapping on dm_spec; spec is used lazily inside dm_info, which doesn't appear to be forced later in mkClass. So the fix would be to make it so that when can peel back the outer Maybe without forcing the contents of dm:

--- a/compiler/iface/TcIface.hs +++ b/compiler/iface/TcIface.hs @@ -429,20 +429,23 @@ tc_iface_decl _parent ignore_prags tc_sig :: IfaceClassOp -> IfL TcMethInfo tc_sig (IfaceClassOp occ rdr_ty dm) = do { op_name <- lookupIfaceTop occ - ; ~(op_ty, dm') <- forkM (mk_op_doc op_name rdr_ty) $ - do { ty <- tcIfaceType rdr_ty - ; dm' <- tc_dm dm - ; return (ty, dm') } + ; let doc = mk_op_doc op_name rdr_ty + ; op_ty <- forkM (doc <+> text "ty") $ tcIfaceType rdr_ty -- Must be done lazily for just the same reason as the -- type of a data con; to avoid sucking in types that -- it mentions unless it's necessary to do so + ; dm' <- tc_dm doc dm ; return (op_name, op_ty, dm') } - tc_dm :: Maybe (DefMethSpec IfaceType) -> IfL (Maybe (DefMethSpec Type)) - tc_dm Nothing = return Nothing - tc_dm (Just VanillaDM) = return (Just VanillaDM) - tc_dm (Just (GenericDM ty)) = do { ty' <- tcIfaceType ty - ; return (Just (GenericDM ty')) } + tc_dm :: SDoc + -> Maybe (DefMethSpec IfaceType) + -> IfL (Maybe (DefMethSpec Type)) + tc_dm _ Nothing = return Nothing + tc_dm _ (Just VanillaDM) = return (Just VanillaDM) + tc_dm doc (Just (GenericDM ty)) + = do { -- Must be done lazily to avoid sucking in types + ; ty' <- forkM (doc <+> text "dm") $ tcIfaceType ty + ; return (Just (GenericDM ty')) }

We check the fix, and yes! It works!

The parting glass

I won’t claim that my debugging process was the most efficient possible—not mentioned in this blog post is the day I spent reading the commit history to try and convince myself that there wasn’t actually a bug where we forgot to update the global type environment. But there seem to be a few generalizable lessons here:

  1. If you see some trace output, the way to make the trace most useful to you is to determine where in the code the trace comes from, and what the compiler is doing at that point in time. Usually, grepping for the trace message is good enough to figure this out.
  2. The smaller your test cases, the smaller your traces will be, which will make it easier to interpret the traces. When I ran my test case using ghc --make rather than ghc -c, there was a lot more logging output. Sure the ending trace is the same but if there was something important in the earlier trace, it would have been that much harder to dig out.
  3. If you can trust your traces, debugging is easier. If I had trusted the trace output, I could have found the bug a lot more quickly. But I didn't, and instead spent a bunch of time making sure the code was behaving in the way I expected it to. On the plus side, I understand the codepath here a lot better than I used to.

How can GHC make debugging these types of bugs easier? Have your own laziness-related debugging story? I would love to hear what you think.

Categories: Offsite Blogs

Confirmation Email for Hackage Account?

haskell-cafe - Sun, 05/15/2016 - 6:23am
Hi. How long does it usually take to receive the confirmation email for a hackage account registration?
Categories: Offsite Discussion

Mark Jason Dominus: My Favorite NP-Complete Problem at !!Con 2016

Planet Haskell - Sun, 05/15/2016 - 2:49am

Back in 2006 when this blog was new I observed that the problem of planning Elmo’s World video releases was NP-complete.

This spring I turned the post into a talk, which I gave at !!Con 2016 last week.

Talk materials are online.

Categories: Offsite Blogs

[ANN] process-extras-0.4.1

haskell-cafe - Sat, 05/14/2016 - 10:32pm
I thought I should announce the process-extras package, available from hackage <>. It extends the process reading functions in System.Process to allow output types ByteString and Text, both strict and lazy. It also provides a function readCreateProcessLazy that can read from processes whose output is infinite - the regular readProcess functions block until they find EOF. There are also some new usage examples in the file. This package has a long history, so if you are familiar with an older version you may want to take another look. Best, David Fox _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe< at >
Categories: Offsite Discussion

Yesos forms and types with foreign keys

haskell-cafe - Sat, 05/14/2016 - 9:57pm
Hello, I have a question about RESTful web apps in Haskell. I'm using Yesod but I suppose the same question can apply to any other framework with type safety features. Suppose I have entities A and B defined in my Persistent model file. One of A's fields is a foreign key referring to B. Each A has its own unique B. The process of creating a new A would be: 1. Set the user a web form via GET 2. Receive for field data via POST 3. Validate the input 4. Insert a new B value to the DB, and get its ID 5. Define an A value which includes that ID as a foreign key, and insert the new A value to the DB In Yesod, If I use a form of type 'AForm A', running the form must return a complete A value. But when I runFormPost, I don't have the B entity yet so I don't have an ID for the foreign key! Ideas what I can do: - Set that field to some dummy value, and override it later when I insert the B value and get its ID - Define a 'NewA' type which has the same fields as A except for that foreign key field - Insert t
Categories: Offsite Discussion

Call for Papers: WFLP 2016 - Workshop on Functional and (Constraint) Logic Programming

General haskell list - Fri, 05/13/2016 - 6:19pm
24th International Workshop on Functional and (Constraint) Logic Programming (WFLP 2016) September 13-14, part of the Leipzig Week of Declarative Programming (L-DEC 2016) *********************************************************** Deadlines: * abstract submission: June 15, 2016 * paper submission: June 22, 2016 * notification: July 15, 2016 * final version due: August 10, 2016 *********************************************************** The international workshops on functional and (constraint) logic programming aim at bringing together researchers, students, and practitioners interested in functional programming, logic programming, and their integration. This year the workshop is co-located with two other events as part of in order to promote the cross-fertilizing exchange of ideas and experiences among and between the communities interested in the foundations, applications, and combinations of high-level, declarative programming lan
Categories: Incoming News

SPLASH-I 2016: Call for Talk Proposals!

General haskell list - Wed, 05/11/2016 - 12:05pm
SPLASH-I: Innovation, Interaction, Insight, Industry, Invited The ACM SIGPLAN conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH) embraces all aspects of software construction and delivery to make it the premier conference at the intersection of programming, languages, and software engineering. SPLASH 2016 will take place from Sunday, October 30 to Friday, November 4, 2016 in Amsterdam, The Netherlands. SPLASH-I is the track of SPLASH dedicated to great talks on exciting topics! SPLASH-I will run in parallel with all of SPLASH (during the week days), and is open to all attendees. SPLASH-I will host both invited talks and selected talks submitted via this call for proposals. SPLASH-I solicits inspiring talks, tutorials and demonstrations on exciting topics related to programming and programming systems, delivered by excellent speakers from academia or industry. SPLASH-I caters for three categories of presentations: - Regular talks on programming languages, system
Categories: Incoming News

Template Haskell Tutorial

haskell-cafe - Wed, 05/11/2016 - 11:49am
Hi, I've been diving into Template Haskell recently and thought to write down what I've learned. The result hopes to be a practical, example-driven introduction to Template Haskell's main features: I hope it serves others new to Template Haskell. It's all on the Haskell wiki, so feedback and improvements are welcome! Cheers, Dominik.
Categories: Offsite Discussion

SPLASH 2016: Call for Sponsorships

General haskell list - Tue, 05/10/2016 - 11:54pm
The ACM SIGPLAN conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH) embraces all aspects of software construction and delivery to make it the premier conference at the intersection of programming, languages, and software engineering. SPLASH 2016 will take place from Sunday, October 30 to Friday, November 4, 2016 in Amsterdam, The Netherlands. Web version of this call for sponsorships: SPLASH is where the best of the best in software innovation, programming and programming languages convene, learn from and inspire each other, and share their passion for software. Supporting SPLASH is an opportunity to put your corporate name in front of this community — a superb investment for your organization. # Sponsorship Packages ## Diamond: $US 15 000 Benefits: - Recognition as a supporter in print and on the web. Recognized as a Diamond supporter on registration brochures and conference program - Two company-provided i
Categories: Incoming News