As a newcomer to Haskell - not even that, only a learner. I have been quite surprised about how relaxed Haskellers are about stating type class laws given how forthright they are about the absolute rightness of a strong type system.
I might even risk contributing to a meme by saying "every sufficiently complicated Haskell library has an incomplete and informally-constructed Agda program in the blog posts expounding on it".
I am not here to push an inevitable march towards dependent typing. I would just like to be able to have a way of formally documenting these laws. Of course once the properties are there I'm sure there are many things that could be done:
- plugging in an external theorem prover
- generating quick-check tests
- allowing the compiler to rely on them for optimisation - for instance if the compiler knows that the applied function in fold is associative then the operation can be panellised (I can't imagine this could be done in any other way than recognising particular special cases, but it could still be profitable even then).
- other things
I understand that achieving this is probably more complicated than I could possibly imagine. What surprises me is that I don't even hear anyone worrying about it.submitted by maninalift
[link] [21 comments]
Today I released version 0.2 of unagi-chan, a haskell library implementing fast and scalable FIFO queues with a nice and familiar API. It is available on hackage and you can install it with:$ cabal install unagi-chan
This version provides a bounded queue variant (and closes issue #1!) that has performance on par with the other variants in the library. This is something I’m somewhat proud of, considering that the standard TBQueue is not only significantly slower than e.g. TQueue, but also was seen to livelock at a fairly low level of concurrency (and so is not included in the benchmark suite).
Here are some example benchmarks. Please do try the new bounded version and see how it works for you.
What follows are a few random thoughts more or less generally-applicable to the design of bounded FIFO queues, especially in a high-level garbage-collected language. These might be obvious, uninteresting, or unintelligible.What is Bounding For?
I hadn’t really thought much about this before: a bounded queue limits memory consumption because the queue is restricted from growing beyond some size.
But this isn’t quite right. If for instance we implement a bounded queue by pre-allocating an array of size bounds then a write operation need not consume any additional memory; indeed the value to be written has already been allocated on the heap before the write even begins, and will persist whether the write blocks or returns immediately.
Instead constraining memory usage is a knock-on effect of what we really care about: backpressure; when the ratio of “producers” to their writes is high (the usual scenario), blocking a write may limit memory usage by delaying heap allocations associated with elements for future writes.
So bounded queues with blocking writes let us:
- when threads are “oversubscribed”, transparently indicate to the runtime which work has priority
- limit future resource usage (CPU time and memory) by producer threads
We might also like our bounded queue to support a non-blocking write which returns immediately with success or failure. This might be thought of (depending on the capabilities of your language’s runtime) as more general than a blocking write, but it also supports a distinctly different notion of bounding, that is bounding message latency: a producer may choose to drop messages when a consumer falls behind, in exchange for lower latency for future writes.Unagi.Bounded Implementation Ideas
Trying to unpack the ideas above helped in a few ways when designing Unagi.Bounded. Here are a few observations I made.We need not block before “writing”
When implementing blocking writes, my intuition was to (when the queue is “full”) have writers block before “making the message available” (whatever that means for your implementation). For Unagi that means blocking on an MVar, and then writing a message to an assigned array index.
But this ordering presents a couple of problems: first, we need to be able to handle async exceptions raised during writer blocking; if its message isn’t yet “in place” then we need to somehow coordinate with the reader that would have received this message, telling it to retry.
By unpacking the purpose of bounding it became clear that we’re free to block at any point during the write (because the write per se does not have the memory-usage implications we originally naively assumed it had), so in Unagi.Bounded writes proceed exactly like in our other variants, until the end of the writeChan, at which point we decide when to block.
This is certainly also better for performance: if a wave of readers comes along, they need not wait (themselves blocking) for previously blocked writers to make their messages available.
One hairy detail from this approach: an async exception raised in a blocked writer does not cause that write to be aborted; i.e. once entered, writeChan always succeeds. Reasoning in terms of linearizability this only affects situations in which a writer thread is known-blocked and we would like to abort that write.Fine-grained writer unblocking in probably unnecessary and harmful
In Unagi.Bounded I relax the bounds constraint to “somewhere between bounds and bounds*2”. This allows me to eliminate a lot of coordination between readers and writers by using a single reader to unblock up to bounds number of writers. This constraint (along with the constraint that bounds be a power of two, for fast modulo) seemed like something everyone could live with.
I also guess that this “cohort unblocking” behavior could result in some nicer stride behavior, with more consecutive non-blocking reads and writes, rather than having a situation where the queue is almost always either completely full or empty.One-shot MVars and Semaphores
This has nothing to do with queues, but just a place to put this observation: garbage-collected languages permit some interesting non-traditional concurrency patterns. For instance I use MVars and IORefs that only ever go from empty to full, or follow a single linear progression of three or four states in their lifetime. Often it’s easier to design algorithms this way, rather than by using long-lived mutable variables (for instance I struggled to come up with a blocking bounded queue design that used a circular buffer which could be made async-exception-safe).
Similarly the CAS operation (which I get exported from atomic-primops) turns out to be surprisingly versatile far beyond the traditional read/CAS/retry loop, and to have very useful semantics when used on short-lived variables. For instance throughout unagi-chan I do both of the following:
CAS without inspecting the return value, content that we or any other competing thread succeeded.
CAS using a known initial state, avoiding an initial read
The documentation for Text.Show states that this holds for derived instances:
The result of sho is a syntactically correct Haskell expression [...]
This fact can be used to pass states of certain computations around as text files or similar objects, because show converts the data to a string representation, while show's complement read enables continuing the earlier computations.
Of course the above is only possible if show and read are either derived automatically or implemented appropriately.
Now my question is: Should the statement from the beginning hold for any show function? Or, more general, Should a law / laws regarding the behaviour of read and show be prescribed, similar to Monad and Monoid?
A trivial law would be:read . show == id
This would have to hold for any x which is part of Read and Show.
This question originated from StackOverflow.submitted by ThreeFx
[link] [11 comments]
After 10 years I finally got around to re-decorating my website. One reason was ICFP, where just too many people told me that I don’t look like on my old website any more (which is very true). Another reason was that I was visting my brother, who is very good at web design (check out his portfolio), who could help me a bit.
I wanted something practical and maybe a bit staid, so I drew inspiration from typical Latex typography, and also from Edward Z. Yang’s blog: A serif font (Utopia) for the main body, justified and hyphenated text. Large section headers in a knobbly bold sans-serif font (Latin Modern Sans, which reasonably resembles Computer Modern). To intensify that impression, I put the main text on a white box that lies – like a paper – on the background. As a special gimmic the per-page navigation (or, in the case of the blog, the list of categories) is marked up like a figure in a paper.
Of course this would be very dire without a suitable background. I really like the procedural art by Jared Tarbell, espcially substrate and interAggregate. Both have been turned into screensavers shipped with xscreensaver, so I hacked the substrate code to generate a seamless tile and took a screenshot of the result. I could not make up my mind yet how dense it has to be to look good, so I for every page I randomly pick one of six variants randomly for now.
I simplified the navigation a bit. The old News section has been removed recently already. The Links section is gone – I guess link lists on homepages are so 90s. The section Contact and About me are merged and awaiting some cleanup. The link to the satire news Heisse News is demoted to a mention on the Contents section.
This hopefully helps to make the site navigatable on mobile devices (the old homepage was unusable). CSS media queries adjust the layout slightly on narrow screens, and separately for print devices.
Since the last release of ghc-heap-view, which was compatible with GHC-7.6, I got 8 requests for a GHC-7.8 compatible version. I started working on it in January, but got stuck and then kept putting it off.
Today, I got the ninths request, and I did not want to wait for the tenth, so I finally finished the work and you can use the new ghc-heap-view-0.5.2 with GHC-7.8.
I used this chance to migrate its source repository from Darcs to git (mirrored on GitHub), so maybe this means that when 7.10 comes out, the requests to update it come with working patches :-). I also added a small test script so that travis can check it:
I did not test it very thoroughly yet. In particular, I did not test whether ghc-vis works as expected.
I still think that the low-level interface that ghc-heap-view creates using custom Cmm code should move into GHC itself, so that it does not break that easily, but I still did not get around to propose a patch for that.
In January I announced my implementation of weight-biased leftist heaps verified with dependent types in Agda. This was part of my work on a paper submitted to CAV’14 conference. The paper got rejected and I decided not to resubmit it anywhere else. At this year’s ICFP listening to Stephanie Weirich’s keynote speech motivated me to finally port that implementation to Haskell, something that I had planned for a couple of months now. You can take a look at the result on github. Here I want to share some of my experiences and insights.
My overall impression is that porting from Agda to Haskell turned out to be fairly straightforward. It was definitely not a complete rewrite. More like syntax adjustments here and there. There were of course some surprises and bumps along the way but nothing too problematic. More precise details are given in the code comments.Agda beats Haskell
When it comes to programming with dependent types Agda, being a fully-fledged dependently-typed language, beats Haskell in many aspects:
- Agda has the same language for terms and types. Haskell separates these languages, which means that if I want to have addition for natural numbers then I need to have two separate definitions for terms and types. Moreover, to tie types and terms together I need singleton types. And once I have singleton types then I need to write third definition of addition that works on singletons. All of this is troublesome to write and use. (This tedious process can be automated by using singletons package.)
- interactive agda-mode for Emacs makes writing code much simpler in Agda. Here I was porting code that was already written so having an interactive Emacs mode for Haskell was not at all important. But if I were to write all that dependently-typed code from scratch in Haskell this would be painful. We definitely need better tools for dependently-typed programming in Haskell.
- Agda admits Unicode identifiers. This allows me to have type constructors like ≥ or variables like p≥b. In Haskell I have GEq and pgeb, respectively. I find that less readable. (This is very subjective.)
- Agda has implicit arguments that can be deduced from types. Haskell does not, which makes some function calls more difficult. Surprisingly that was not as huge problem as I initially thought it will be.
- Agda is total, while Haskell is not. Since there are bottoms in Haskell it is not sound as a logic. In other words we can prove false eg. by using undefined.
The list is noticeably shorter:
- Haskell has much better term-level syntax. In many places this resulted in significantly shorter code than in Agda.
- Haskell is not total. As stated earlier this has its drawbacks but it also has a good side: we don’t need to struggle with convincing the termination checker that our code does actually terminate. This was painful in Agda since it required using sized types.
- Haskell’s gcastWith function is much better than Agda’s subst. Both these functions allow type-safe casts given the proof that the cast is safe. The difference is that Agda’s subst requires more explicit arguments (as I noted earlier the opposite is usually the case) and restricts the cast to the last type parameter (Haskell allows cast for any type parameter).
While the list of wins is longer for Agda than it is for Haskell I’m actually very happy with Haskell’s performance in this task. The verification in Haskell is as powerful as it is in Agda. No compromises required.
It’s worth remarking that my implementation works with GHC 7.6, so you don’t need the latest fancy type-level features like closed type families. The really essential part are the promoted data types.