Consider this code:

(>>?) :: Maybe a -> (a -> Maybe b) -> Maybe b
Nothing >>? _ = Nothing
Just v  >>? f = f v

-- file: ch10/PNM.hs
parseP5_take2 :: L.ByteString -> Maybe (Greymap, L.ByteString)
parseP5_take2 s =
    matchHeader (L8.pack "P5") s      >>?
    \s -> skipSpace ((), s)           >>?
    (getNat . snd)                    >>?
    skipSpace                         >>?
    \(width, s) ->   getNat s         >>?
    skipSpace                         >>?
    \(height, s) ->  getNat s         >>?
    \(maxGrey, s) -> getBytes 1 s     >>?
    (getBytes (width * height) . snd) >>?
    \(bitmap, s) -> Just (Greymap width height maxGrey bitmap, s)

skipSpace :: (a, L.ByteString) -> Maybe (a, L.ByteString)
skipSpace (a, s) = Just (a, L8.dropWhile isSpace s)
I'm quite surprised that width, height and maxGrey are visible outside of the lambdas that bind them. Is this normal behavior in Haskell, or is there something I'm missing?

submitted by Kiuhnm
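For what it's worth, this is normal: a lambda's body extends as far to the right as possible, so everything later in the (>>?) chain is still inside the earlier lambdas. A minimal sketch of the same scoping, using the standard Maybe monad's (>>=) instead of the custom (>>?):

```haskell
-- Each lambda's body runs to the end of the expression, so earlier
-- bindings remain in scope for everything that follows.
example :: Maybe Int
example =
  Just 3 >>= \x ->      -- x is in scope from here to the end
  Just 4 >>= \y ->      -- y (and still x) is in scope here
  Just (x + y)          -- both visible
```

The chain parses as `Just 3 >>= (\x -> Just 4 >>= (\y -> Just (x + y)))`, which is why the later steps can see the earlier bindings.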
[link] [4 comments]
I have a value of type a and a function that maps any value of that type into a list of values of that same type, a -> [a].
These are the two most generic values that I need to have a tree-like structure (at least that I can think of), e.g.:

data NTree a = Node a [NTree a] | Leaf a

tree :: NTree Int
tree = ...

children :: NTree a -> [NTree a]
children (Leaf _)   = []
children (Node _ a) = a
The pair (tree, children) is the tree-like structure that I'm trying to abstract over.
The function I want to create "flattens" the entire tree into a single list. I think this would do it:

flatTree :: NTree a -> [NTree a]
flatTree (Leaf a)          = [Leaf a]
flatTree (Node _ children) = concatMap flatTree children
Or, with the generic pair:

flatTree :: (a -> [a]) -> a -> [a]
flatTree f x = if null children then [x] else x : xs
  where
    children = f x
    xs       = concatMap (flatTree f) children
Is there a standard function for flatTree? If it is not in the libraries, does it have a standard name? Or should I stick with my flatTree function?

submitted by evohunz
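For what it's worth, Data.Tree in the containers package captures essentially this pair: unfoldTree builds a tree from a function of type b -> (a, [b]), and flatten lists every label in preorder. A sketch:

```haskell
import Data.Tree (Tree, flatten, unfoldTree)

-- Build a small tree from a seed and an expansion function,
-- then flatten it back to a preorder list of labels.
numbers :: Tree Int
numbers = unfoldTree (\n -> (n, if n >= 4 then [] else [2 * n, 2 * n + 1])) 1

allLabels :: [Int]
allLabels = flatten numbers  -- [1,2,4,5,3,6,7]
```

The (seed, expand) pair in the question corresponds directly to the arguments of unfoldTree.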
[link] [14 comments]
Hey all. I'm working on a draft of a document expressing my thoughts on using Haskell in industry. I'd like to see what other people think of this. Do you have any additional notes you think are important? Or do you think that anything I've claimed is false? Any other suggestions, comments, improvements, questions, etc, are all appreciated.
Recently, there has been some discussion about the place Haskell has in industry. I'd like to summarize my opinions on the matter: why Haskell is a good choice and why it is potentially a dangerous one.

The Case Against Haskell
There's a reason Haskell adoption has been slow in industry, and it's the most important aspect to the case against Haskell: It's not one of the currently popular languages.
Although that may sound trite, language popularity has a number of side effects which make it a significant concern when it comes to building a stack using a particular language.
Library Availability: The more popular a language, the more likely it is that esoteric libraries for it have been written. For example, even though it's a total hack, Spark supports Python; the same sort of interface is possible in Haskell, but no one has yet made it, because Haskell isn't as popular as Python. (There have been discussions on the mailing lists, but no one has yet put in the engineering time to get it to happen.)
Community Support: It's hard to measure community support for languages, but it's likely that if you were to count the number of StackOverflow posts about Python, Django, Ruby, or Rails, they would outnumber the posts about Haskell or Yesod or Snap by a significant amount. My guess would be 10x.
Hiring: Just due to their popularity, many more people will know C, C++, Java, Python, Ruby, etc, than will know Haskell. As a result, focusing on hiring experienced Haskellers would significantly restrict the people you can hire.
Learning Curve: Haskell has a fairly high learning curve. This comes from the fact that it is a fairly unusual breed of functional programming, with somewhat academic origins. Many people claim that learning Haskell is like learning to program from scratch; although these people are wrong, there is a grain of truth to what they are saying: someone who has never programmed with functional concepts or in a strong static type system may have a lot of difficulty with Haskell.
The other issue that comes to mind is one that does not relate to language popularity, but instead has to do with the language itself.
Laziness: Haskell is a lazy language, meaning that values are not evaluated until they are needed. Although this has a whole host of benefits, one of the issues with lazy evaluation is that it can lead to difficult-to-analyze memory usage. As a result, Haskell programmers sometimes need to debug space leaks, which are issues where unintended laziness leads to excessive memory consumption. This does not come up incredibly often, and these issues can be avoided, but nonetheless it is a class of defect that rarely occurs in other languages.

The Case For Haskell
Now that I've laid out some of the problems with using Haskell in industry, I'd like to point out the benefits, and why I think there is a large class of problems for which Haskell is a great choice. Some of these comments are parallel to the case against Haskell.
Library Availability: Although Haskell is not as popular as many languages, it has a fairly good database of packages, called Hackage. Hackage has about 9000 packages, while PyPI (Python's package database) has about 64,000 (a factor of about 7x). It's not quite as extensive as other languages, but it's far from nonexistent, and frequently covers even fairly rare niches. The package database is growing very quickly, as well!
Hiring: There is an undersupply of Haskell jobs and an oversupply of people who would like to use Haskell in industry. There are many incredibly skilled people who really like Haskell but do not have the opportunity to use it in industry. For these people, a job opportunity involving Haskell may be a compelling reason to switch employment or take a position at a lower salary, or may in general engage Haskell programmers who would otherwise pass the opportunity by. Quoting from Quora: "Another thing people talk about is finding Haskell programmers. In practice, this is actually reversed! Startups using Haskell report that it's easier to hire Haskellers if you control for quality. You might get fewer applicants, but they'll be uniformly higher in quality. It's the modern version of the Python paradox."
Learning Curve: Haskell does indeed have a learning curve, but it's a learning curve that teaches valuable skills to practitioners. The hardest things involved in learning Haskell come from practices and requirements such as pure functions, separation of pure and effectful code, and high levels of abstraction, all of which are incredibly useful for a high-quality codebase. Learning Haskell forces certain habits which are good in everyday programming regardless of language. In addition, companies using Haskell in industry report that training new Haskellers is actually not as hard as some believe: "It's also not nearly as difficult to train people to use Haskell as one would believe; IMVU mentions this in their great post about what it's like to use Haskell. And if IMVU could convert their PHP programmers to Haskell, you can probably train yours just as easily!"
Refactoring: Writing Haskell pays off, especially when it comes to the maintainability of your code base. Because of the strong type system, you can safely make very global changes to your codebase; properties of the language make refactoring more systematic and easier, and the compiler will make sure that any changes you make are reflected throughout your entire code base. The result is that you can safely iterate quickly on the entire design of the system; instead of being stuck with design decisions you made early on, you can change your design as your business requirements shift. Technical debt is much easier to pay off, resulting in more robust systems over time.
Testing: Purity, lazy evaluation, and the level of abstraction that Haskell provides make testing incredibly easy. State-of-the-art testing libraries such as QuickCheck and SmallCheck are often pioneered in Haskell, and later ported to languages such as Scala and Python. In addition, the strong and expressive static type system means that many things you must test for in other languages do not need tests in Haskell. For example, red-black trees in Haskell can be written such that the type system guarantees that the color and balance invariants hold, and as a result, you don't need to write tests that check whether your trees are balanced, because the type system proves that property for you! As a result, you can engineer very stable and well-tested systems with less developer effort than in some other languages.
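As a sketch of the property-based style that QuickCheck popularized (hand-rolled here so it stays dependency-free; real QuickCheck generates random inputs and shrinks counterexamples):

```haskell
-- State a law as a pure predicate, then check it over many inputs.
prop_revRev :: [Int] -> Bool
prop_revRev xs = reverse (reverse xs) == xs

-- Crude stand-in for a random generator: prefixes of a fixed list.
inputs :: [[Int]]
inputs = [take n (cycle [3, 1, 4, 1, 5, 9, 2, 6]) | n <- [0 .. 50]]

-- True iff the property holds for every sampled input.
propertyHolds :: Bool
propertyHolds = all prop_revRev inputs
```

Because prop_revRev is a pure function, checking it requires no mocking or setup; this is the "purity makes testing easy" point in miniature.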
Speed and Parallelism: The main Haskell compiler, GHC, is a state-of-the-art masterpiece. It has both an LLVM and a native code backend, and it can generate incredibly performant code. With fairly straightforward optimizations, Haskell code can run at speeds fairly close to C. The IO manager and runtime can handle thousands of threads easily, and the pure functional nature makes it easy to parallelize your algorithms. Haskell can be made very high-performance and parallel with minimal code changes.
Code Re-Use: The expressive type system and high level of abstraction commonly used in Haskell code permit a very high level of code reuse. Patterns are abstracted into reusable components, and then high-level interfaces to complex patterns are built into libraries (see lens, pipes, conduit, free, FRP). Reducing code duplication means developers can focus on building systems and designing algorithms rather than debugging the same code and patterns over and over again.

Experimenting with Haskell
I would recommend moving forward with Haskell as an experimental investment; if initial experiences are good, invest in it more heavily, but if you find that it is causing difficulties, switch to a more traditional language without the disadvantages Haskell presents. I believe that the advantages brought about by high-quality software engineering in Haskell far outweigh the detriments, especially in the long run.
Based on what has worked for others, experimenting with a new technology can be done by starting out small: first, replace an existing service with a well-defined protocol with one written in Haskell (or write a new non-critical service you need in Haskell). If you're satisfied with the result, hire someone experienced in Haskell to maintain and develop this service, as well as potentially others. If this goes well, continue with your commitment to the language and ecosystem; if it turns out that it's not a fit for the team or the business, stop, and if necessary rewrite the service in a more conventional language. (This is why it is important to keep that service small, isolated, and non-critical.) In the best case, you will have discovered a secret weapon for your company that allows you to move faster than your competitors and attract better talent; in the worst case, you will have wasted a few days of company time on rewriting a small service (which will be better than it was originally anyway, due to the experience gained when writing it the first time).

submitted by NiftyIon
[link] [115 comments]
Summary: I wrote a Conduit combinator which makes the upstream and downstream run in parallel. It makes Hoogle database generation faster.
The Hoogle database generation parses lines one-by-one using haskell-src-exts, and then encodes each line and writes it to a file. Using Conduit, that ends up being roughly:

parse =$= write
Conduit ensures that parsing and writing are interleaved, so each line is parsed and written before the next is parsed - ensuring minimal space usage. Recently the FP Complete guys profiled Hoogle database generation and found each of these pieces takes roughly the same amount of time, and together are the bottleneck. Therefore, it seems likely that if we could parse the next line while writing the previous line we should be able to speed up database generation. I think of this as analogous to CPU pipelining, where the next instruction is decoded while the current one is executed.
I came up with the combinator:

pipelineC :: Int -> Consumer o IO r -> Consumer o IO r
Allowing us to write:

parse =$= pipelineC 10 write
Given a buffer size of 10 (the maximum number of elements in memory simultaneously) and a Consumer (write), this produces a new Consumer which is roughly the same but runs in parallel to its upstream (parse).

The Result
When using 2 threads, the Hoogle 5 database creation drops from 45s to 30s. The CPU usage during the pipelined stage hovers between 180% and 200%, suggesting the stages are quite well balanced (as the profile suggested). The parsing stage is currently a little slower than the writing, so a buffer of 10 is plenty - increasing the buffer makes no meaningful difference. The reason the drop in total time is only 33% is that the non-pipelined steps (parsing Cabal files, writing summary information) take about 12s.
Note that Hoogle 5 remains unreleased, but it can be tested from the git repo and will hopefully be ready soon.

The Code
The idea is to run the Consumer on a separate thread, and on the main thread keep pulling elements (using await) and passing them to the other thread, without blocking the upstream yield. The only tricky bit is what to do with exceptions. If the consumer thread throws an exception, we have to get it back to the main thread so it can be dealt with normally. Fortunately, async exceptions fit the bill perfectly. The full code is:

pipelineC :: Int -> Consumer o IO r -> Consumer o IO r
pipelineC buffer sink = do
    sem <- liftIO $ newQSem buffer  -- how many are in flow, to avoid excess memory
    chan <- liftIO newChan          -- the items in flow (type o)
    bar <- liftIO newBarrier        -- the result type (type r)
    me <- liftIO myThreadId
    liftIO $ flip forkFinally (either (throwTo me) (signalBarrier bar)) $ do
        runConduit $
            (whileM $ do
                x <- liftIO $ readChan chan
                liftIO $ signalQSem sem
                whenJust x yield
                return $ isJust x) =$=
            sink
    awaitForever $ \x -> liftIO $ do
        waitQSem sem
        writeChan chan $ Just x
    liftIO $ writeChan chan Nothing
    liftIO $ waitBarrier bar
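For reference, the barrier used above (newBarrier, signalBarrier, waitBarrier) is a one-shot write-once variable. A minimal sketch over MVar, assuming the semantics described for the extra package's Control.Concurrent.Extra (signalled at most once; waiting blocks until signalled, then the value can be read repeatedly):

```haskell
import Control.Concurrent.MVar

-- A write-once synchronization variable.
newtype Barrier a = Barrier (MVar a)

-- Create an empty barrier.
newBarrier :: IO (Barrier a)
newBarrier = Barrier <$> newEmptyMVar

-- Fill the barrier; should be called at most once.
signalBarrier :: Barrier a -> a -> IO ()
signalBarrier (Barrier var) = putMVar var

-- Block until the barrier is filled; does not empty it.
waitBarrier :: Barrier a -> IO a
waitBarrier (Barrier var) = readMVar var
```

readMVar (rather than takeMVar) is what makes repeated waits safe: every waiter sees the same value.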
We are using a channel chan to move elements from producer to consumer, a quantity semaphore sem to limit the number of items in the channel, and a barrier bar to store the return result (see the earlier post about the barrier type). On the consumer thread we read from the channel and yield to the consumer. On the main thread we awaitForever and write to the channel. At the end we move the result back from the consumer thread to the main thread. The full implementation is in the repo.

Enhancements
I have specialised pipelineC for Consumers that run in the IO monad. Since the Consumer can do IO, and the order of that IO has changed, it isn't exactly equivalent - but relying on such IO timing seems to break the spirit of Conduit anyway. I suspect pipelineC is applicable in some other monads, but I am not sure which (ReaderT and ResourceT seem plausible, StateT seems less likely).
Acknowledgements: Thanks to Tom Ellis for helping figure out what type pipelineC should have.
So I get the following error on Travis-ci:
cabal: You need to re-run the 'configure' command. The version of Cabal being used has changed (was Cabal-188.8.131.52, now Cabal-184.108.40.206).
This makes zero sense. I start off clean and install a specific version of cabal-install, as well as GHC. I do this with various versions of GHC in a Travis matrix.
What's even stranger is that before the release of GHC 7.10.2 yesterday, this behaviour used to crop up on 7.10.2 but 7.10.1 worked just fine. Now that 7.10.2 has been officially released the two have effectively swapped around!
Answers online are pretty obvious: your version of cabal-install has changed and you need to run configure again and/or delete some folders in your project. But my cabal-install version has NOT changed; at least, I haven't done anything to deliberately change it.
Any clues?

submitted by BoteboTsebo
[link] [10 comments]
Stack was recently introduced as an alternative to Cabal, if I understand correctly, and is designed to be better but also backwards compatible. It has seen wide adoption considering that it has only been out for a few months. Are there plans to replace Cabal with Stack?

submitted by bitmadness
[link] [65 comments]
The type of fmap is:

fmap :: (a -> b) -> f a -> f b
Just looking at this, I don't quite understand how this applies to lists of things. I understand how this works for something like Maybe a. I think of it like this:

fmap g (f a) = f (g a)
I just can't see how this same thought works with lists. My not understanding this doesn't stop me from using fmap on lists, but I think it would be really nice to understand the tools I'm using.
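In case it helps: for lists, the functor f in fmap's type is [] itself, so fmap specializes to (a -> b) -> [a] -> [b], which is exactly map. A sketch:

```haskell
-- fmap on a list applies the function to every element, i.e. fmap = map.
doubled :: [Int]
doubled = fmap (* 2) [1, 2, 3]  -- [2,4,6]

-- Compare with Maybe, where the function is applied under Just:
halved :: Maybe Int
halved = fmap (`div` 2) (Just 10)  -- Just 5
```

In both cases fmap applies the function "inside the container" without changing the container's shape: the list keeps its length, and Just stays Just.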
Thanks!

submitted by mrg58
[link] [14 comments]
Warning: quibbling about style is likely to follow. Also, it is possible that I am not being entirely serious - I haven't decided yet :)
One quirk of writing lens-heavy code is that if you are not disciplined, you make readers dizzy from having to read lines in two directions (composition of lenses and (&) vs. composition of plain functions and ($)). However, there is a way to exploit this bidirectionality that (arguably, supposedly, questionably, etc.) might actually improve readability. The trick is using (&) to separate the interesting parts of expressions (i.e. those that are semantically rich, domain-specific, etc.) from the uninteresting ones (i.e. lifting, type juggling, error message insertion, etc.). Here is a sample do-block:

fetchGlubs :: MonadIO m => ExceptT String m [Glub]
fetchGlubs = do
    txt <- extractTextFromQuux <$> fetchQuuxViaAPI
               & ExceptT
    catMaybes . fmap destroyInconsistentGlubs
        <$> parse Parser.glubs "fetchGlubs" txt
               & (first show >>> hoistEither)
Note the composition is right-to-left in the interesting parts (because it is what we are more used to) and left-to-right in the uninteresting ones (for consistency with (&), and also to allow using the (&) as a fork when reading the line). There is even a mnemonic for using (&) in this way: "Take this interesting expression and then do this other boring stuff to it".
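A tiny illustration of (&) as reverse application (x & f = f x), which is what makes the left-to-right reading possible:

```haskell
import Data.Function ((&))

-- The interesting expression comes first; the plumbing follows after (&).
evenSum :: Int
evenSum = [1 .. 10]
            & filter even
            & sum          -- 2 + 4 + 6 + 8 + 10
```

Read forward: "take [1..10], and then keep the evens, and then sum them".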
Have I just devised a stylistic abomination? Or might it actually be useful in some contexts?

submitted by duplode
[link] [20 comments]
I just wanted to post a quick reminder that you still have a few days to submit a talk for this year's Haskell Implementors' Workshop.
So, um, yeah, you should go and do that.

submitted by edwardkmett
I'm an intermediate haskeller and I'm starting a new project at work (to be used internally, so not concerned with security overmuch). I've sketched out a frontend in Reflex, served by Yesod, but I'm looking for advice on what options to consider for the rest of the backend. Multiple (computer, not human) clients will be pushing messages to the server, which will in turn store them and simultaneously push the received messages to the browser (over websockets) for display and historical querying.
The simplest option I can think of is JSON + HTTP POST from the clients, as the clients are in other languages (Python, C, ...). For something better than JSON, I'm considering thrift. But there's a constellation of other message description formats out there; some with community haskell support, like protobuf2 and msgpack, and nicer ones (currently, to my knowledge) without support - e.g., protobuf3, microsoft bond, flatbuffers, capnproto.
A REST API is also not something I'm thrilled with as a poor person's message queue. Again, there are many options in the Haskell space with varying levels of stability and support - kafka, nanomsg, zeromq, RabbitMQ, etc. I don't need something as complex as RabbitMQ for my needs (no federation or complex routing topologies) so what are the community's preferences/experiences in Haskell? Kafka could solve my needs for message persistence, though it's a little more complex to configure than the broker-less options.
Lastly, I could skip a message bus entirely - clients could connect directly to a redis or rethinkdb backend for instance (as above, removing the need for me to persist messages), and my server could simply poll for new data. (I assume I'll need a server since I can't compile rethinkdb-haskell with ghcjs for the browser, given the dependency on network... - again, I'm ignoring security concerns). If I don't do this, I'll probably refer to the many past reddit conversations to choose my persistence layer - Yesod's batteries-included persistent or esqueleto are probably fine for my needs (e.g., https://www.reddit.com/r/haskell/comments/33k8zx/what_databases_are_most_haskellers_using/).
With my prioritization of simplicity (for both the clients and myself) over other non-needs (authentication, high scalability, etc.), are there any suggestions out of the above, or recommendations on solutions I'm not aware of or haven't mentioned? Thanks!

submitted by fmapthrowaway
[link] [13 comments]
Welcome to the latest entry in the GHC Weekly News. Today GHC HQ met to discuss plans post-7.10.2.

GHC 7.10.2 release
GHC 7.10.2 has been released!
As always, if you suspect that you have found a regression, don't hesitate to open a Trac ticket. We are especially interested in performance regressions with fairly minimal reproduction cases.

GHC 7.10.2 and the text package
A few days ago a report came in of long compilation times under 7.10.2 on a program with many Text literals (#10528). This ended up being due to a change in the simplifier which caused it to perform rule rewrites on the left-hand sides of other rules. While this is questionable (read "buggy") behavior, it doesn't typically cause trouble so long as rules are properly annotated with phase control numbers to ensure they are performed in the correct order. Unfortunately, it turns out that the rules provided by the text package for efficiently handling string literals did not include phase control annotations. This resulted in a rule from base being performed on the literal rules, which rendered the literal rules ineffective. The simplifier would then expend a great deal of effort trying to simplify the rather complex terms that remained.
Thankfully, the fix is quite straightforward: ensure that the text literal rules fire in the first simplifier phase (phase 2). This avoids interference from the base rules, allowing them to fire as expected.
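To make the phase-control syntax concrete, a hedged sketch with made-up functions (GHC's simplifier phases count down 2, 1, 0; an annotation like "[2]" activates a rule beginning in phase 2, while "[~2]" deactivates it from phase 2 on):

```haskell
-- Hypothetical functions, kept out of inlining so the rule can match.
step1, step2 :: Int -> Int
step1 = (+ 1)
step2 = (* 2)
{-# NOINLINE step1 #-}
{-# NOINLINE step2 #-}

-- A meaning-preserving rewrite, active from phase 2 onwards:
-- step1 (step2 x) = 2x + 1 = step2 (step1 x) - 1.
{-# RULES
"step1/step2" [2] forall x. step1 (step2 x) = step2 (step1 x) - 1
  #-}
```

The phase annotation is what lets you order rules relative to each other (and relative to inlining), which is exactly what the text rules were missing.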
This fix is now present in text-1.2.1.3. Users of GHC 7.10.2 should use this release if at all possible. Thanks to text's maintainer, Bryan O'Sullivan, for taking time out of his vacation to help me get this new release out.
While this misbehaviour was triggered by a bug in GHC, a similar outcome could have arisen even without this bug. This highlights the importance of including phase control annotations on INLINE and RULES pragmas: without them, the compiler may apply rewrites in an order that you did not anticipate. This has also drawn attention to a few shortcomings in the current rewrite rule mechanism, which lacks the expressiveness to encode complex ordering relationships between rules. This limitation pops up in a number of places, including when trying to write rules on class-overloaded functions. Simon Peyton Jones is currently pondering possible solutions to this in #10595.

StrictData
This week we merged the long-anticipated -XStrictData extension (Phab:D1033) by Adam Sandberg Ericsson. This implements a subset of the StrictPragma proposal initiated by Johan Tibell. In particular, StrictData allows a user to specify that datatype fields should be strict-by-default on a per-module basis, greatly reducing the syntactic noise introduced by this common pattern. In addition to implementing a useful feature, the patch ended up being a nice clean-up of GHC's handling of strictness annotations.
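A minimal sketch of what the extension does (requires a GHC new enough to ship -XStrictData):

```haskell
{-# LANGUAGE StrictData #-}

-- With StrictData, both fields below are strict by default,
-- as if written: data Point = Point !Int !Int
data Point = Point Int Int

norm1 :: Point -> Int
norm1 (Point x y) = abs x + abs y
```

Without the extension, the same effect requires writing a bang (!) on every field by hand.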
What remains of this proposal is the stronger -XStrict extension, which essentially makes all bindings strict-by-default. Adam has indicated that he may take up this work later this summer.

AMP-related performance regression
In late May Herbert Valerio Riedel opened Phab:D924, which removed an explicit definition for mapM in the Traversable instance, as well as redefining mapM_ in terms of traverse_, to bring consistency with the post-AMP world. The patch remains unmerged, however, due to a failing ghci testcase. It turns out the regression is due to the redefinition of mapM_, which uses (*>) where (>>) was once used. This tickles poor behavior in ghci's ByteCodeAsm module. The problem can be resolved by defining (*>) = (>>) in the Applicative Assembler instance (e.g. Phab:D1097). That being said, the fact that this change has already exposed performance regressions raises doubts as to whether it is prudent.

GHC Performance work
Over the last month or so I have been working on nailing down a variety of performance issues in GHC and the code it produces. This has resulted in a number of patches which in some cases dramatically improve compilation time (namely Phab:D1012 and Phab:D1041). Now that 7.10.2 is out, I'll again be spending most of my time on these issues. We have heard a number of reports that GHC 7.10 has regressed on real-world programs. If you have a reproducible performance regression that you would like to see addressed, please open a Trac ticket.

Merged patches
- Phab:D1028: Fixity declarations are now allowed for infix data constructors in GHCi (thanks to Thomas Miedema)
- Phab:D1061: Fix a long-standing correctness issue arising when pattern matching on floating point values
- Phab:D1085: Allow programs to run in environments lacking iconv (thanks to Reid Barton)
- Phab:D1094: Improve code generation in integer-gmp (thanks to Reid Barton)
- Phab:D1068: Implement support for the MO_U_Mul2 MachOp in the LLVM backend (thanks to Michael Terepeta)
- Phab:D524: Improve runtime system allocator performance with two-step allocation (thanks to Simon Marlow)
That's all for this time. Enjoy your week!