News aggregator

Michael Snoyman: Spreading the Gospel of Haskell

Planet Haskell - Mon, 11/21/2016 - 6:00pm

Yesterday I fired off two tweets about the state of Haskell evangelism:

I'd hoped by now we'd be out spreading the gospel of Haskell's awesome features. Instead, we're fighting about the same broken things.

— Michael Snoyman (@snoyberg) November 21, 2016

Haskell is by far the best language on the market today. It's so sad to see it not gaining traction because of unimportant details.

— Michael Snoyman (@snoyberg) November 21, 2016 <script async="async" charset="utf-8" src="http://platform.twitter.com/widgets.js"></script>

But simply complaining about the state of things, instead of actually proposing a way to make things better, is a waste of 280 characters. So I'd like to expand on where I think we, the Haskell community, can do better.

To try and prevent the flamewars which I'm sure are about to start, let me give some guidelines on who shouldn't read this blog post:

  • If you think that programming languages should succeed on purely technical merits, and silly "marketing activities" like writing newbie-oriented tutorials and making engaging screencasts is unfair competition, you shouldn't read this blog post.
  • If you think that broken dependency solving, Hackage downtime, confusing cabal-install incantations, and problematic global package databases in the Haskell Platform have had no ill effect on Haskell adoption, you shouldn't read this blog post.
  • If you think that new users who give up after 5 minutes of confusion on a website weren't serious enough in the first place and we shouldn't be sad that we lost them, you shouldn't read this blog post.

And most likely, you shouldn't post this to /r/haskell. That's not going to end well.

Attacking Haskell's Flaws

As the Twitter discussion yesterday pointed out, there are undoubtedly flaws in Haskell. It may be inflammatory to admit that publicly, but so be it. Every language has flaws. Haskell is blessed to also have some of the greatest strengths in any programming language available today: beautiful concurrency, a powerful and useful type system, a plethora of real-world libraries, and (as of recently) pretty good tooling and educational resources.

At FP Complete, we often talk about the attractors and obstacles (thanks to our CEO, Aaron Contorer, for this great prism to view things). Using that terminology: Haskell is chock-full of attractors. The problem is the obstacles which prevent Haskell from taking off. I'm going to claim that, at this point, we need to do very little as far as making Haskell more attractive, but instead need to collectively knock down obstacles preventing its success.

Obstacles can be a great many things, some of which you may have categorized as "missing attractors." Let me give some examples:

  • Missing IDE tooling. For some people, this is a deal-breaker, and will prevent them from using Haskell.
  • A missing library. Again, if someone needs to access, say, MS SQL Server, and a library doesn't exist, this is an obstacle to adoption. (Yes, that person could go ahead and write the library him/herself. If you think that's the right response, you probably shouldn't be reading this blog post.)
  • Lack of a tutorial/example/cookbook for a specific problem domain. Yes, someone could struggle through reading API docs until "it clicks." If that's your answer: also probably shouldn't be reading this post.
  • Lack of support for an OS/architecture.

The important thing about obstacles is that they are not universal. For most of us, lack of support for Haiku OS will not prevent us from using Haskell. Those of us who have been using Haskell for years have decided that the obstacles of bad tooling weren't enough to deter us from the great language features. And so on.

Prioritizing

Many people in the Haskell community have been chipping away at random obstacles (or adding random attractors) for years now, on a hobbyist basis. If that's all you want to do, more power to you, and enjoy. What I'm doing here is making a call for a more concerted, organized effort into knocking down these obstacles to Haskell adoption.

I'd say that we can measure how high a priority an obstacle-destroying action is based on two criteria:

  • How difficult it will be to accomplish
  • How big an impact it will have on Haskell adoption

I would call easy actions with big impact low hanging fruit, and recommend we focus on those actions for now. In other words, while improving GHC compile times may have a big impact, it's also difficult to accomplish. Similarly, changing the Haskell logo from purple to blue is easy to accomplish, but doesn't have much impact.

So my set of easy to do and big impact things entirely come down to spreading the word. I would say our biggest and easiest knocked-down obstacles are:

  • Someone's never heard of Haskell
  • Someone's heard of Haskell, but doesn't know why it's relevant
  • Someone's well aware of Haskell, but thinks it will be hard to start with
  • Someone's already tried Haskell and run into problems (like Dependency Hell), and doesn't realize we've solved them

So what does this entail? Here are my suggestions:

  • Write a blog post about how you solved a problem in Haskell
  • Give a talk at a conference on what problems Haskell is particularly good at solving (my money goes to concurrency on this)
  • Put together a screencast on Haskell
  • Encourage a non-Haskeller to go through the Haskell Book and Haskell Syllabus

The intention here to show the world that Haskell is ready to help them, and that it's easy to get started now. Many of us at FP Complete have been putting out such posts for a while. I'm asking others to join in the fun and help give Haskell adoption a kick-start.

One final request: if you've gotten this far, odds are you agree that we need to encourage users to take the most-likely-to-succeed route to Haskell, be that with tooling, training, library installation, or library selection. We've put a lot of effort into making haskell-lang.org the destination for that goal. Hopefully haskell.org can converge on this goal in the future, but for now it's very likely to just present another obstacle. When you tell people to get started with Haskell, I strongly recommend linking to https://haskell-lang.org/get-started.

Categories: Offsite Blogs

FP Complete: Mastering Time-to-Market with Haskell

Planet Haskell - Sun, 11/20/2016 - 8:00pm

For bringing your product to market, there isn't just one metric for success. That depends on your business needs. Haskell is the pure functional programming language that brings tangible benefits over its competitors for a variety of TTM priorities. Let's explore four of them.

Speed to market

You may want to bring your product to market as quickly as possible, because you have direct competitors; you have to produce a functioning demo for your investors; or your success in the market is time sensitive. Haskell speeds up this process in various ways.

Constuct things correctly the first time: Haskell is statically typed, which is a form of program verification that guarantees correctness of certain properties, like in Java or C#. Unlike Java or C#, Haskell is a pure functional language, leading to verification of far more portions of the program source code. With feedback from the compiler while developing, your developers are guaranteed a certain level of correctness and this allows them to concentrate on your domain business logic. As written elsewhere, worst practices should be hard. Also, a case study on Haskell vs C# for contract writing.

Reduce testing time: Haskell emphasizes the correct by construction approach, which is to use the type system to verify that program parts are combined in only the ways that will not crash and that make sense. Examples range from the simple to advanced. For example, Haskell web frameworks like Yesod prevent XSS and accidental broken links statically. The more surface area of your problem covered by static analysis, the less time and effort is needed by your developers for writing unit and integration tests, and the tests blow up less often in continuous integration.

Make builds reproducible: Haskell projects that are built with the Stack build tool, are guaranteed reproducible builds using a stable set of packages, with Stack also providing docker support out of the box, adding an additional layer of reproducibility. If a reproducible build is created on one developer's machine, it will work on any. This significantly reduces ramp up time for your developers, and makes continuous integration trivial for your devops people.

Use the concurrency: Many problems are solved more easily with concurrency (networking, file I/O, video/image/document processing, database access, etc.) simply because the programming model is easier to understand. Haskell has some of the best concurrency support of any popular language, it has a breadth of tools, efficiency, stability, and it is trivial to use, out of the box. Let your developers use it. See Beautiful concurrency for more about concurrency in Haskell. Additionally, the code doesn't have to be rewritten in an arcane style like in NodeJS to gain concurrency.

Shipping on schedule

You may not need to ship as soon as possible, but to ship on schedule, for a demo, a conference or as promised to your investors or customers. For this, there is another mitigating feature of Haskell.

Types shed light on scope: Using the type system of Haskell to model your domain logic helps to expand the "fog of war" that we experience with project scope: there are many unknowns, and we need to enumerate all the cases and kinds of things. Armed with this UML-without-the-UML, you can have confidence in how much scope you can cover now for the current shipping schedule, and what needs to come in version 2. This gives confidence in time estimates made by your developers. See Haskell-Providing Peace of Mind for a case-study. See Using Haskell at SQream Technologies for a technical demonstration.

Avoid build system detours: Reproducible builds help to avoid the inevitable build system detours that happen on large projects, where developers are wasting time getting the build system to work on eachother's machines and debugging failures. Stack is reproducible out of the box.

Minimizing resources

You might want or need to be cost-effective in your development cycle, using as few developers or machines as possible. Haskell also reduces development costs.

Less testing is required: Haskell software requires less testing. Your developers always write tests, but with Haskell they can write fewer, and spend less time on it. Furthermore, fewer developers are needed to achieve a stable system, because a type system helps limit scope of possibilities, and lets your developers manage complexity.

Use Haskell's raw performance: Additionally, Haskell is a fast language on a single core. It is compiled to native machine code, and is competitive with C# and Java. At times it is competitive with C, see Haskell from C: Where are the for loops? for a technical demonstration. This means that you need fewer machine resources, and fewer machines. Haskell is also easy to write concurrent code for, which means you can make use of those additional cores on your machines.

Flexbility to make changes

Most developments put a high priority in flexibility to change, or should. But you may have particularly pronounced need for flexibility in changes to requirements without disruption. This is perhaps Haskell's strongest, most desirable advantage.

Correct by construction extends to reconstruction: Making a change to a well-typed system in Haskell is far more stable and reliable than its competitors because correctness invariants are maintained with any change of the code. See also this case study by Silk.

Less maintenance of the test suite under change: This requires less maintenance of the test suite, because static analysis gives your developers immediate feedback at a granular level, whereas a test suite typically does not guide them through the change process, it only tells them what their out-of-date specification expects. Once the change process is complete, updating the test suite becomes easier.

Types are rarely buggy: It's very rare to design a data type that has bugs in it, because they are so simple. Meanwhile a unit test or integration test suite presents additional developer overhead because it itself is a program that requires maintenance too.

See this Bump case study, for why teams are choosing Haskell over Python, and here for the comparison against Ruby of a similar nature: Haskell yields fewer errors and bugs.

Expanding development effort

You may find that your project requires hiring new developers and building a team. Now is the perfect time to hire Haskell developers. Like Python developers 10 years ago, Haskell developers are self-selecting; they learn it because it's a better language, not because it will guarantee them employment. At the same time, at this stage in Haskell's development, the wealth of practical, stable packages available indicate an infusion of pragmatic, experienced programmers.

Summary and further reading

In summary we've seen that:

  • Haskell decreases development time by narrowing the scope of development to your domain.
  • Haskell decreases the need and dependency on unit testing alone.
  • Haskell aids in reproducibility which helps teams work together, expand, and deploy.
  • Haskell's speed and easy concurrency reduce the cost of development substantially.
  • Haskell helps your developers evolve a system over time safely and with confidence, yielding more reliable time estimates.

You can learn more about using Haskell as a business at FP Complete's home page, in particular the Consulting page, or go and contact us straight away and we'll be in touch.

Categories: Offsite Blogs

Brent Yorgey: MonadRandom 0.5 and mwc-random: feedback wanted

Planet Haskell - Wed, 11/16/2016 - 5:36am

Since 2013 or so I have been the maintainer of the MonadRandom package, which provides an mtl-style type class for monads with support for generation of pseudorandom values, along with a concrete random monad transformer RandT. As of this writing it has 89 reverse dependencies on Hackage—a healthy number, and one that makes me think carefully about any breaking changes to the package.

Recently I got a number of pull requests, and have been working on putting together an 0.5 release which adds a few functions, adds lazy- and strict-state variants of RandT, and reorganizes things to be closer to standard practice of the transformers package. Since this release will include some technically breaking changes already, it’s a good time to think about potentially including others.

The one thing I am not sure what to do about is this issue: Allow MonadRandom interface for MWC-random. mwc-random is a very nice package for psuedorandom number generation, but apparently it does not fit into the MonadRandom abstraction. First of all, I would like to understand why—I am not very familiar with mwc-random. Second of all, I’d love to figure out a solution, but ideally one that causes as little breakage to existing code as possible.

Leave a comment (either here or on the github issue) if this is something you know/care about, and let’s see if we can figure out a good solution together!


Categories: Offsite Blogs

Michael Snoyman: Haskell's Missing Concurrency Basics

Planet Haskell - Tue, 11/15/2016 - 6:00pm

I want to discuss two limitations in standard Haskell libraries around concurrency, and discuss methods of improving the status quo. Overall, Haskell's concurrency story is - in my opinion - the best in class versus any other language I'm aware of, at least for the single-machine use case. The following are two issues that I run into fairly regularly and are a surprising wart:

  • putStrLn is not thread-safe
  • Channels cannot be closed

Let me back up these claims, and then ask for some feedback on how to solve them.

putStrLn is not thread-safe

The example below is, in my opinion, a prime example of beautiful concurrency in Haskell:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async import Control.Concurrent.Async import Control.Monad (replicateM_) worker :: Int -> IO () worker num = replicateM_ 5 $ putStrLn $ "Hi, I'm worker #" ++ show num main :: IO () main = do mapConcurrently worker [1..5] return ()

Well, it's beautiful until you see the (abridged) output:

Hi, HIiH'HH,imii , ,,I w 'IoIIm'r'' mkmmw e owrwwro ookr#rrek2kkre ee rrr# H 3#i## 4,51

Your mileage may vary of course. The issue here is that Prelude.putStrLn works on String, which is a lazy list of Chars, and in fact sends one character at a time to stdout. This is clearly not what we want. However, at the same time, many Haskellers - myself included - consider String-based I/O a bad choice anyway. So let's replace this with Text-based I/O:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.Text as T import qualified Data.Text.IO as T worker :: Int -> IO () worker num = replicateM_ 5 $ T.putStrLn $ T.pack $ "Hi, I'm worker #" ++ show num main :: IO () main = do mapConcurrently worker [1..5] return ()

Unfortunately, if you run this (at least via runghc), the results are the same. If you look at the implementation of Data.Text.IO.hPutStr, you'll see that there are different implementations of that function depending on the buffering straregy of the Handle we're writing to. In the case of NoBuffering (which is the default with GHCi and runghc), this will output one character at a time (just like String), whereas LineBuffering and BlockBuffering have batch behavior. You can see this with:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.Text as T import qualified Data.Text.IO as T import System.IO worker :: Int -> IO () worker num = replicateM_ 5 $ T.putStrLn $ T.pack $ "Hi, I'm worker #" ++ show num main :: IO () main = do hSetBuffering stdout LineBuffering mapConcurrently worker [1..5] return ()

While better, this still isn't perfect:

Hi, I'm worker #4Hi, I'm worker #5Hi, I'm worker #1 Hi, I'm worker #4Hi, I'm worker #5Hi, I'm worker #1 Hi, I'm worker #4Hi, I'm worker #5

Unfortunately, because newlines are written to stdout separately from the message, these kinds of issues happen too frequently. This can be worked around too by using putStr instead and manually appending a newline character:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.Text as T import qualified Data.Text.IO as T import System.IO worker :: Int -> IO () worker num = replicateM_ 5 $ T.putStr $ T.pack $ "Hi, I'm worker #" ++ show num ++ "\n" main :: IO () main = do hSetBuffering stdout LineBuffering mapConcurrently worker [1..5] return ()

Finally, we can avoid the buffering-dependent code in the text package and use ByteString output, which has the advantage of automatically using this append-a-newline logic for small-ish ByteStrings:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Monad (replicateM_) import qualified Data.ByteString.Char8 as S8 worker :: Int -> IO () worker num = replicateM_ 100 $ S8.putStrLn $ S8.pack $ "Hi, I'm worker #" ++ show num main :: IO () main = do mapConcurrently worker [1..100] return ()

However, this has the downside of assuming a certain character encoding, which may be different from the encoding of the Handle.

What I'd like I would like a function Text -> IO () which - regardless of buffering strategy - appends a newline to the Text value and sends the entire chunk of data to a Handle in a thread-safe manner. Ideally it would account for character encoding (though assuming UTF8 may be an acceptable compromise for most use cases), and it would be OK if very large values are occassionally compromised during output (due to the write system call not accepting the entire chunk at once).

What I'd recommend today In a number of my smaller applications/scripts, I've become accustomed to defining a say = BS.hPutStrLn stdout . encodeUtf8. I'm tempted to add this to a library - possibly even classy-prelude - along with either reimplementing print as print = say . T.pack . show (or providing an alternative to print). I've also considered replacing the putStrLn in classy-prelude with this implementation of say.

However, I'm hoping others have some better thoughts on this, because I don't really find these solutions very appealing.

Non-closable channels

Let's implement a very simple multi-worker application with communication over a Chan:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent import Control.Concurrent.Async import Control.Monad (forever) import Data.Text (Text, pack) import Data.Text.Encoding (encodeUtf8) import qualified Data.ByteString.Char8 as S8 say :: Text -> IO () say = S8.putStrLn . encodeUtf8 worker :: Chan Int -> Int -> IO () worker chan num = forever $ do i <- readChan chan say $ pack $ concat [ "Worker #" , show num , " received value " , show i ] main :: IO () main = do chan <- newChan mapConcurrently (worker chan) [1..5] `concurrently` mapM_ (writeChan chan) [1..10] return ()

(Yes, I used the aforementioned say function.)

This looks all well and good, but check out the end of the output:

Worker #5 received value 8 Worker #3 received value 9 Worker #1 received value 10 Main: thread blocked indefinitely in an MVar operation

You see, the worker threads have no way of knowing that there are no more writeChan calls incoming, so they continue to block. The runtime system notes this, and sends them an async exception to kill them. This is a really bad idea for program structure as it can easily lead to deadlocks. Said more simply:

Instead, the workers should have some way of knowing that the channel is closed. This is a common pattern in other languages, and one I think we should borrow. Implementing this with STM isn't too bad actually, and can easily have an IO-based API if desired:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text {-# LANGUAGE OverloadedStrings #-} import Control.Applicative ((<|>)) import Control.Concurrent.Async import Control.Concurrent.STM import Data.Text (Text, pack) import Data.Text.Encoding (encodeUtf8) import qualified Data.ByteString.Char8 as S8 say :: Text -> IO () say = S8.putStrLn . encodeUtf8 data TCChan a = TCChan (TChan a) (TVar Bool) newTCChan :: IO (TCChan a) newTCChan = atomically $ TCChan <$> newTChan <*> newTVar False closeTCChan :: TCChan a -> IO () closeTCChan (TCChan _ var) = atomically $ writeTVar var True writeTCChan :: TCChan a -> a -> IO () writeTCChan (TCChan chan var) val = atomically $ do closed <- readTVar var if closed -- Could use nicer exception types, or return a Bool to -- indicate if writing failed then error "Wrote to a closed TCChan" else writeTChan chan val readTCChan :: TCChan a -> IO (Maybe a) readTCChan (TCChan chan var) = atomically $ (Just <$> readTChan chan) <|> (do closed <- readTVar var check closed return Nothing) worker :: TCChan Int -> Int -> IO () worker chan num = loop where loop = do mi <- readTCChan chan case mi of Nothing -> return () Just i -> do say $ pack $ concat [ "Worker #" , show num , " received value " , show i ] loop main :: IO () main = do chan <- newTCChan mapConcurrently (worker chan) [1..5] `concurrently` do mapM_ (writeTCChan chan) [1..10] closeTCChan chan return ()

Fortunately, this problem has a preexisting solution: the stm-chans package, which provides closable and bounded channels and queues. Our problem above can be more easily implemented with:

#!/usr/bin/env stack -- stack --resolver lts-6.23 --install-ghc runghc --package async --package text --package stm-chans {-# LANGUAGE OverloadedStrings #-} import Control.Concurrent.Async import Control.Concurrent.STM import Control.Concurrent.STM.TMQueue import Data.Text (Text, pack) import Data.Text.Encoding (encodeUtf8) import qualified Data.ByteString.Char8 as S8 say :: Text -> IO () say = S8.putStrLn . encodeUtf8 worker :: TMQueue Int -> Int -> IO () worker q num = loop where loop = do mi <- atomically $ readTMQueue q case mi of Nothing -> return () Just i -> do say $ pack $ concat [ "Worker #" , show num , " received value " , show i ] loop main :: IO () main = do q <- newTMQueueIO mapConcurrently (worker q) [1..5] `concurrently` do mapM_ (atomically . writeTMQueue q) [1..10] atomically $ closeTMQueue q return ()

What I'd like The biggest change needed here is just to get knowledge of this very awesome stm-chans package out there more. That could be with blog posts, or even better with links from the stm package itself. A step up from there could be to include this functionality in the stm package itself. Another possible niceity would be to add a non-STM API for these - whether based on STM or MVars internally - for more ease of use. I may take a first step here by simply depending on and reexporting stm-chans from classy-prelude.

What I'd recommend Probably pretty obvious: use stm-chans!

Like the previous point though, I'm interested to see how other people have approached this problem, since I haven't heard it discussed much in the past. Either others haven't run into this issue as frequently as I have, everyone already knows about stm-chans, or there's some other solution people prefer.

Categories: Offsite Blogs

wren gayle romano: Three ineffectual strategies for dealing with trauma and pain

Planet Haskell - Tue, 11/15/2016 - 3:53am

The last week has been challenging for all of us. In the depths of my own fear and uncertainty, I reached for one of my favorite books —Pema Chödrön’s Comfortable with Uncertainty— and opened to a passage at random. On friday, a friend of mine asked how I’ve been able to deal with it all. I told him about the passage, and he (a non-buddhist) found it helpful in dealing with his own pain, so I wanted to share more broadly.

Before getting to the passage, I think it’s important for people to recognize that this pain we are feeling is a collective trauma. This is not our day-to-day pain, not our usual suffering. Everyone develops habits and skills for addressing the typical discomforts of life, but those skills are often inapplicable or ineffective for recovering from truly traumatic events. When someone is in a car wreck, or attacked, or raped, or abruptly loses a job or loved one— we recognize these things as traumas. We recognize that these events take some extra work to recover from. In the aftermath of the election I have seen many of the symptoms of trauma in the people around me. Depression, hypervigilance, difficulty concentrating, short tempers, and so on. When trauma hits, our usual coping mechanisms often fail or go haywire. A drink or two to unwind, turns into bleary drunkenness every night. Playing games to let go, turns into escapism to avoid thinking. Solitude, turns into reclusion. A healthy skepticism, turns into paranoia. If we do not recognize traumas for what they are, it becomes all too easy to find ourselves with even worse problems. Recognition is necessary for forming an appropriate response.

Now, the passage. As humans we have three habitual methods for relating to suffering. All three are ineffectual at reducing that suffering. These three ineffectual strategies are: attacking, indulging, and ignoring. And I’ve seen all three in great quantities in all the OpEd pieces floating around over the past week.

By “attacking” Pema Chödrön means not just lashing out, attacking Trump’s supporters or their ideals, but also all the ways we attack ourselves: We condemn ourselves, criticize ourselves for any indulgence, pity ourselves to the point of not getting out of bed. This strategy shows up in all those articles criticizing us for not having interpreted the polls correctly, or chastising us for not voting, or condemning the way the internet has formed these echo-chamber bubbles, and so on. But all this self-flagellation, all this beating ourselves up, does nothing to heal our pain. Now we suffer not only from our fears of what’s to come, but also because “it’s all our fault”. We refuse to “let ourselves off easy”, so whenever someone tries to address our pain we attack them and beat them away, protecting our pain because we feel like we deserve it.

Indulging is just as common. Though we are haunted by self-doubt, we condone our behavior. We say “I don’t deserve this discomfort. I have plenty of reasons to be angry or sleep all day.” We justify our pain to the point of turning it into a virtue and applauding ourselves. This strategy shows up in all those articles that relish in the details of how bad things will become, or congratulating ourselves for saying something like this would happen. But again, by cherishing our pain and presenting it as something to be praised, we are preventing ourselves from healing. Noone wants to give up something they cherish, nor give up on all the attention and sympathy they are lavished with.

Ignoring is no less common. “Ignoring” means not just refusing to acknowledge our pain and fear, but also pretending it doesn’t exist, dissociating, spacing out, going numb, acting on autopilot, or any of the other ways to try to keep our suffering out of sight and out of mind. This strategy is advocated by all those articles talking about how things actually aren’t that bad, or how this is just business as usual, or how it’ll all get better once the mid-term elections happen. While ignoring seems effective in the short term, it does nothing to address the suffering you feel. In addition to not healing that initial wound, it creates more pain as we inevitably force ourselves into tighter and tighter spaces in order to keep it out of mind.

There is an alternative to these three futile strategies. The enlightened strategy is to try fully experiencing whatever you’ve been resisting— without exiting in your habitual way. Become inquisitive about your habits. Recognize when you are pushing your suffering away, or embracing it, or denying it. Become inquisitive about your suffering. What is it, exactly, that you are denying? Why does it feel so urgent to push it away? Why does it feel so necessary to cling to it? Stop trying to justify your feelings, stop trying to explain them. Start instead to look at them, to see them for what they really are. Ask why it is they hurt, what part of your ego they compromise, what ideals they belie.

The passage on the three futile strategies follows a koan about “heaven and hell”. From a buddhist perspective, “hell” is not a place, it is all the feelings of pain and fear and suffering we experience. Nor is “heaven” a place, but rather all our feelings of gratitude and joy and understanding. Thus, the buddhist does not say “hell is bad and heaven is good” nor “get rid of hell and just seek heaven”. Rather, one should approach all things with an open mind, greeting both heaven and hell with that openness. In her words,

Only with this kind of equanimity can we realize that no matter what comes along, we’re always standing in the middle of a sacred space. Only with equanimity can we see that everything that comes into our circle has come to teach us what we need to know.

I find these words powerfully healing. It is healing to remember that no matter where we are or what befalls us, our life is a blessing, and in virtue of that blessing our bodies and the places we move through are sacred spaces. The sacred is not something which exists without us, but something which is created from within. Moreover, it is healing to step away from questions like “what did I do to deserve this?” and instead remember to ask what it is we can learn from the experience.

I have endured many traumas in my life, and I half expected the election outcome, but still it felt like a kick in the chest. This wound brought back all my darkest habits. Once I recovered from the shock enough to begin the rituals of healing and self-care, I reflected on the question of why this particular wound hurt so bad. In my experience (and not just because I’m buddhist), deep emotional pain always stems from some threat to one’s ego; so what part of my ego is on the line? For me, the reason the election hurt so much is because I had become complacent in believing that the world is steadily becoming a more just place and believing that people are by-and-large fundamentally good. With the election of Obama, the passing of the ACA, the supreme court ruling on Obergefell v. Hodges, and so on, I think a lot of us on the progressive side have been susceptible to those beliefs. The election hurt so much, for me, because it forced the recognition that it’s not just the legacy of systemic institutionalized hatred we must fight, but that over a quarter of the population actively supports the worst extremes of that hatred. Yes, the election itself was offensive. Yes, I fear for my life and the lives of those close to me. But the real root of the pain itself, the reason it hurt so bad, is this refutation of those optimistic beliefs about humanity and the path towards justice. Realizing that this was the root cause of my pain did a lot to help me process it and move on. It also gave a healthy way to shift focus from the pain itself, to something actionable. Having experienced the pain, I can accept it. And having learned what it has to teach me, I know what I must do.

So sit with your pain, and try to experience it fully. Stop pushing it away. Stop embracing it. Stop beating yourself up over it. Approach it with an open mind and let it pass through you. And, finally, ask yourself what you can learn from it.

Twitter Facebook Google+ Tumblr WordPress

comments
Categories: Offsite Blogs

FP Complete: Devops best practices: Immutability

Planet Haskell - Sun, 11/13/2016 - 12:00pm

FP Complete builds on cutting-edge open-source devops technologies, providing devops solutions and consulting to a number of companies in life sciences & health, financial services, and secure Internet services. This exposure has given us a chance to work with some of the best engineering practices in devops.

Contact us if we can help you too

As we bring more companies forward into the world of devops, we will continue to share lessons learned and best practices with the IT community. Today’s best practice: immutability.

What is immutable?

As a software engineering concept, immutability means that once you assign a value or configuration to some entity, you never modify it in place -- if you want it to change, you make a new one and (optionally) tear down the old one.

As we know from functional programming (FP) and our work with Haskell, immutability boosts the reliability and predictability of system behavior -- preventing bugs and downtime, and increasing the speed and reproducibility of software development and deployment. A variable, program, or server is immutable if we guarantee, once it’s set up, that we won’t modify it in place. We can do this if replacements and deployments are cheap.

Devops has made it so much cheaper to build and deploy new software services and even whole servers and clusters, some companies are now taking advantage of immutability -- leading to much more reliable online services with less downtime and more frequent, reliable, repeatable updates.

Old-fashioned servers are mutable

Mutability describes how most traditional software, and most traditional server operations, are done. It means that once you create something, to avoid the terrible cost of creating another one, you just keep modifying the one you already put in place. Software patches, configuration file changes, even changing the value of a variable that’s already in use -- all these are examples of mutability. It’s always been a bit risky, but can be economical when (1) the cost of creating an entity is very high, and (2) the cost of bugs and mistakes is very low.

Unfortunately, mutability is a key cause of bugs and mistakes.

Consider this with a person instead of a computer. If I break my arm (an accidental mutation of my state), we could try to fix the problem using an immutable method: make a new Aaron, identical to the old one, but with a non-broken arm -- then tear down the old Aaron who is no longer needed. Obviously we just don’t have the technology to do this -- it’s beyond unaffordable -- so we are forced to use mutability. We patch up my arm, and wait for it to heal.

That’s way better than giving up, but now our managed service (Aaron) is in an unprecedented, irreproducible state. For weeks my arm is offline, in repair mode, while the rest of me runs. And for the rest of my life I may have to keep track of the fact that this arm is a little weaker, and there are some things I cannot do. My boss now has to remember: Aaron has this special flag that says he can’t lift some kinds of heavy boxes. What a pain. At least humans are flexible, so my colleagues won't just break with an "Error 504" when they try to shake my hand.

If only I could reconstruct the arm in its original state -- or even a whole new Aaron -- life would be so much easier. We may never have that for humans, certainly not in my lifetime. But thanks to modern devops technologies, we do have that option for servers. They don’t need to be modified in place, and they don’t need to run in unprecedented, irreproducible configurations that lead to many of today’s sysadmin emergencies, security breaches, and downtime.

How do we make servers immutable?

Our FP Deploy approach to devops is based on heavy use of containers (notably Docker), virtual machines, and where feasible, virtual networking, cloud computing (AWS, Azure, etc.), and virtual private clouds (VPCs). Every one of these technologies has something in common: it allows us to abstract away the work of creating a running online service. Configurations can be written declaratively, put under source control (say, in a Git repository), and run at any time (using, say, Kubernetes).

You want another server? Just run the deploy command again. You want another whole cluster (a “device” made of multiple servers and associated networking and data connections)? Just run that deploy command again.

By slashing the cost of deployment, we make it possible to create a whole new server painlessly, cheaply, and reproducibly. Developers just delivered a bug fix? Don’t patch the application server! Bring up a new instance based on the new software build. Once you’re happy that it’s running properly, bring down the old, less-good instance.

(In a future post we’ll talk about blue-green deployments and canary deployments -- cost-effective, easy techniques for making this transition cautiously and gracefully.)

An immutable server has a known, well-understood, source-controlled, reproducible configuration. No footnotes. We can be confident that the new production servers are the same as the engineering test servers, because they were created by running exactly the same deployment files -- not by a series of manual admin tweaks that could be incorrect or have latent side effects.

This also makes it easy to recover from disaster, or scale up for increased load, by redeploying a new server from the same deployment files.

We can afford to do this only because modern devops makes it so cheap to create new servers. When doing it right is automated, predictable, repeatable, and inexpensive, there’s no longer any reason to do it wrong.

Can whole clusters, or distributed devices, be immutable?

It’s easy to assert that an application server can be made immutable, because well-architected web app servers are fairly self-contained and fairly stateless. But what if you are making larger changes to a whole distributed device, consisting of many servers and network connections? What if, for example, you have made matching changes in both your front-end server and your back-end server? Or in any group of services in a service-oriented architecture (SOA)?

Then you step up to immutable devices, immutable clusters or VPCs or distributed systems, using exactly the same methodology. At FP Complete we routinely create whole 10+ virtual machine devices on command, even just for testing purposes, because we’ve automated it with FP Deploy. Again, why not do it right? Why not be confident that the whole distributed system is in a known, reproducible state? We can be much more assured that what worked in test is going to work in production.

However, sometimes we don’t have that luxury, for example if we are retrofitting modern devops onto parts of an older system that has not been uniformly upgraded -- or in any case where the service we are upgrading is far cheaper to redeploy than another, perhaps more stateful, service in the same distributed system. That’s ok: we can have immutable servers as parts of a mutable device. The administrator of the distributed system now has to track which services have been replaced with newer versions, but at least for any given service the servers can be immutable.

What about the database server?

Once we have made it cheap to rebuild and replace servers at any time, immutability can be a reality, and reliability and reproducibility go way up. However, not all servers are so easily replaced. In fact, servers providing key input and output channels may be outside the scope of our control -- so all we can do is treat them as external to the immutable device, and understand that hooking up the inputs and outputs needs to be part of the declarative script used to bring up any new version of our device.

The strongest example of this may be an enterprise database server. We typically have no intention of building a whole new database as part of our application build-and-deploy process. Databases are typically long-running and, fundamental to their purpose, they are extremely stateful, extremely mutable. Cloud services such as RDS make it easy to spin up new database instances, but often the contents of an enterprise database are too large or fast-changing to want to rehost -- or we just don’t have permission to do so. Instead, we leave it in place and accept that its contents are very mutable.

So even when we use immutability to make our application server clusters easier to upgrade and less prone to errors, we need to understand that they will almost always be connected up to other servers that lack this golden property. Automated deployment with modern devops, ideally with truly immutable servers, can ensure that your system looks just like the system that worked in test -- eliminating a lot of surprise downtime and deployment failures. But even at companies with modern devops, failures still happen -- and when they do, it’s because the test system was not exposed to the same kinds of inputs and outputs, and the same database state, as the production system. In a future blog post, we’ll look at some best practices for testing and quality control.

Have a look at your own software deployment practices. Are there too many (more than zero) manual steps involved in bringing up a new or updated server? Do you allow, or even require, sysadmins to make changes to servers that are already up and running? Maybe it's time to use automated, reproducible deployment, and make the move to immutable servers.

For more assistance

In addition to our customizable FP Deploy devops solution, FP Complete offers consulting services, from advice to hands-on engineering, to help companies migrate to modern devops.

Contact us for a free consultation

Categories: Offsite Blogs

Mark Jason Dominus: The worst literature reference ever

Planet Haskell - Fri, 11/11/2016 - 4:52am

I think I may have found the single worst citation on Wikipedia. It's in the article on sausage casing. There is the following very interesting claim:

Reference to a cooked meat product stuffed in a goat stomach like a sausage was known in Babylon and described as a recipe in the world’s oldest cookbook 3,750 years ago.

That was exciting, and I wanted to know more. And there was a citation, so I could follow up!

The citation was:

(Yale Babylonian collection, New Haven Connecticut, USA)

I had my work cut out for me. All I had to do was drive up to New Haven and start translating their 45,000 cuneiform tablets until I found the cookbook.

(I tried to find a better reference, and turned up the book The Oldest Cuisine in the World: Cooking in Mesopotamia. The author, Jean Bottéro, was the discoverer of the cookbook, or rather he was the person who recognized that this tablet was a cookbook and not a pharmacopoeia or whatever. If the Babylonian haggis recipe is anywhere, it is probably there.)

Categories: Offsite Blogs

Derek Elkins: Constant-time Binary Logarithm

Planet Haskell - Thu, 11/10/2016 - 1:48am
Introduction

I’ve been watching the Spring 2012 lectures for MIT 6.851 Advanced Data Structures with Prof. Erik Demaine. In lecture 12, “Fusion Trees”, it mentions a constant time algorithm for finding the index of the first most significant 1 bit in a word, i.e. the binary logarithm. Assuming word operations are constant time, i.e. in the Word RAM model, the below algorithm takes 27 word operations (not counting copying). When I compiled it with GHC 8.0.1 -O2 the core of the algorithm was 44 straight-line instructions. The theoretically interesting thing is, other than changing the constants, the same algorithm works for any word size that’s an even power of 2. Odd powers of two need a slight tweak. This is demonstrated for Word64, Word32, and Word16. It should be possible to do this for any arbitrary word size w.

The clz instruction can be used to implement this function, but this is a potential simulation if that or a similar instruction wasn’t available. It’s probably not the fastest way. Similarly, find first set and count trailing zeros can be implemented in terms of this operation.

Code

Below is the complete code. You can also download it here.

{-# LANGUAGE BangPatterns #-} import Data.Word import Data.Bits -- Returns 0-based bit index of most significant bit that is 1. Assumes input is non-zero. -- That is, 2^indexOfMostSignificant1 x <= x < 2^(indexOfMostSignificant1 x + 1) -- From Erik Demaine's presentation in Spring 2012 lectures of MIT 6.851, particularly "Lecture 12: Fusion Trees". -- Takes 26 (source-level) straight-line word operations. indexOfMostSignificant1 :: Word64 -> Word64 indexOfMostSignificant1 w = idxMsbyte .|. idxMsbit where -- top bits of each byte !wtbs = w .&. 0x8080808080808080 -- all but top bits of each byte producing 8 7-bit chunks !wbbs = w .&. 0x7F7F7F7F7F7F7F7F -- parallel compare of each 7-bit chunk to 0, top bit set in result if 7-bit chunk was not 0 !pc = parallelCompare 0x8080808080808080 wbbs -- top bit of each byte set if the byte has any bits set in w !ne = wtbs .|. pc -- a summary of which bytes (except the first) are non-zero as a 7-bit bitfield, i.e. top bits collected into bottom byte !summary = sketch ne `unsafeShiftR` 1 -- parallel compare summary to powers of two !cmpp2 = parallelCompare 0xFFBF9F8F87838180 (0x0101010101010101 * summary) -- index of most significant non-zero byte * 8 !idxMsbyte = sumTopBits8 cmpp2 -- most significant 7-bits of most significant non-zero byte !msbyte = ((w `unsafeShiftR` (fromIntegral idxMsbyte)) .&. 0xFF) `unsafeShiftR` 1 -- parallel compare msbyte to powers of two !cmpp2' = parallelCompare 0xFFBF9F8F87838180 (0x0101010101010101 * msbyte) -- index of most significant non-zero bit in msbyte !idxMsbit = sumTopBits cmpp2' -- Maps top bits of each byte into lower byte assuming all other bits are 0. -- 0x2040810204081 = sum [2^j | j <- map (\i -> 49 - 7*i) [0..7]] -- In general if w = 2^(2*k+p) and p = 0 or 1 the formula is: -- sum [2^j | j <- map (\i -> w-(2^k-1) - 2^(k+p) - (2^(k+p) - 1)*i) [0..2^k-1]] -- Followed by shifting right by w - 2^k sketch w = (w * 0x2040810204081) `unsafeShiftR` 56 parallelCompare w1 w2 = complement (w1 - w2) .&. 0x8080808080808080 sumTopBits w = ((w `unsafeShiftR` 7) * 0x0101010101010101) `unsafeShiftR` 56 sumTopBits8 w = ((w `unsafeShiftR` 7) * 0x0808080808080808) `unsafeShiftR` 56 indexOfMostSignificant1_w32 :: Word32 -> Word32 indexOfMostSignificant1_w32 w = idxMsbyte .|. idxMsbit where !wtbs = w .&. 0x80808080 !wbbs = w .&. 0x7F7F7F7F !pc = parallelCompare 0x80808080 wbbs !ne = wtbs .|. pc !summary = sketch ne `unsafeShiftR` 1 !cmpp2 = parallelCompare 0xFF838180 (0x01010101 * summary) !idxMsbyte = sumTopBits8 cmpp2 !msbyte = ((w `unsafeShiftR` (fromIntegral idxMsbyte)) .&. 0xFF) `unsafeShiftR` 1 !cmpp2' = parallelCompare 0x87838180 (0x01010101 * msbyte) -- extra step when w is not an even power of two !cmpp2'' = parallelCompare 0xFFBF9F8F (0x01010101 * msbyte) !idxMsbit = sumTopBits cmpp2' + sumTopBits cmpp2'' sketch w = (w * 0x204081) `unsafeShiftR` 28 parallelCompare w1 w2 = complement (w1 - w2) .&. 0x80808080 sumTopBits w = ((w `unsafeShiftR` 7) * 0x01010101) `unsafeShiftR` 24 sumTopBits8 w = ((w `unsafeShiftR` 7) * 0x08080808) `unsafeShiftR` 24 indexOfMostSignificant1_w16 :: Word16 -> Word16 indexOfMostSignificant1_w16 w = idxMsnibble .|. idxMsbit where !wtbs = w .&. 0x8888 !wbbs = w .&. 0x7777 !pc = parallelCompare 0x8888 wbbs !ne = wtbs .|. pc !summary = sketch ne `unsafeShiftR` 1 !cmpp2 = parallelCompare 0xFB98 (0x1111 * summary) !idxMsnibble = sumTopBits4 cmpp2 !msnibble = ((w `unsafeShiftR` (fromIntegral idxMsnibble)) .&. 0xF) `unsafeShiftR` 1 !cmpp2' = parallelCompare 0xFB98 (0x1111 * msnibble) !idxMsbit = sumTopBits cmpp2' sketch w = (w * 0x249) `unsafeShiftR` 12 parallelCompare w1 w2 = complement (w1 - w2) .&. 0x8888 sumTopBits w = ((w `unsafeShiftR` 3) * 0x1111) `unsafeShiftR` 12 sumTopBits4 w = ((w `unsafeShiftR` 3) * 0x4444) `unsafeShiftR` 12
Categories: Offsite Blogs

Chung-chieh Shan: Very far away from anywhere else

Planet Haskell - Thu, 11/10/2016 - 12:11am

My experience immigrating to a foreign place is inextricable from my experience growing up. Somewhere along the way I learned the habit of recoiling from disappointment by lowering my expectations to match. I shrunk myself in space and time, never delaying others by holding the subway door open. As I practiced attending to the present and the internal, I dissociated from planning for the future and trusting the external. What’s beyond my arm’s reach is not my home and I don’t belong there. If I don’t have time to make a vegetable available by growing it myself, then it doesn’t deserve to be my comfort food.

People say they don’t recognize their country anymore. I just realized that in my case, it’s myself that I don’t recognize anymore.

Categories: Offsite Blogs

Brent Yorgey: The divided difference track

Planet Haskell - Tue, 11/08/2016 - 9:17pm

My wife and son made a train track corresponding to the regular expression of divided differences, :


Categories: Offsite Blogs

FP Complete: Covariance and Contravariance

Planet Haskell - Tue, 11/08/2016 - 8:00pm

Typeclasses such as Bifunctor are often expressed in terms of whether they are covariant or contravariant. While these terms may appear intimidating to the unfamiliar, they are a precise language for discussing these concepts, and once explained are relatively easy to understand. Furthermore, the related topics of positive and negative position can greatly simplify how you think about complex data structures. This topic also naturally leads into subtyping.

This post is intended to give a developer-focused explanation of the terms without diving into the category theory behind them too much. For more information, please see the Wikipedia page on covariance and contravariance.

This blog post is also part of the FP Complete Haskell Syllabus and part of our Haskell training.

The Functor typeclass: covariant functor

Let's consider the following functions (made monomorphic for clarity):

showInt :: Int -> String showInt = show floorInt :: Double -> Int floorInt = floor

Now suppose that we have a value:

maybeInt :: Maybe Int maybeInt = Just 5

We know Maybe is an instance of Functor, providing us with the following function:

fmapMaybe :: (a -> b) -> Maybe a -> Maybe b fmapMaybe = fmap

We can use fmapMaybe and showInt together to get a new, valid, well-typed value:

maybeString :: Maybe String maybeString = fmapMaybe showInt maybeInt

However, we can't do the same thing with floorInt. The reason for this is relatively straightforward: in order to use fmapMaybe on our Maybe Int, we need to provide a function that takes an Int as an input, whereas floorInt returns an Int as an output. This is a long-winded way of saying that Maybe is covariant on its type argument, or that the Functor typeclass is a covariant functor.

Doesn't make sense yet? Don't worry, it shouldn't. In order to understand this better, let's contrast it with something different.

A non-covariant data type

Consider the following data structure representing how to create a String from something:

newtype MakeString a = MakeString { makeString :: a -> String }

We can use this to convert an Int into a String:

newtype MakeString a = MakeString { makeString :: a -> String } showInt :: MakeString Int showInt = MakeString show main :: IO () main = putStrLn $ makeString showInt 5

The output for this program is, as expected, 5. But suppose we want to both add 3 to the Int and turn it into a String. We can do:

newtype MakeString a = MakeString { makeString :: a -> String } plus3ShowInt :: MakeString Int plus3ShowInt = MakeString (show . (+ 3)) main :: IO () main = putStrLn $ makeString plus3ShowInt 5

But this approach is quite non-compositional. We'd ideally like to be able to just apply more functions to this data structure. Let's first write that up without any typeclasses:

newtype MakeString a = MakeString { makeString :: a -> String } mapMakeString :: (b -> a) -> MakeString a -> MakeString b mapMakeString f (MakeString g) = MakeString (g . f) showInt :: MakeString Int showInt = MakeString show plus3ShowInt :: MakeString Int plus3ShowInt = mapMakeString (+ 3) showInt main :: IO () main = putStrLn $ makeString plus3ShowInt 5

But this kind of mapping inside a data structure is exactly what we use the Functor type class for, right? So let's try to write an instance!

instance Functor MakeString where fmap f (MakeString g) = MakeString (g . f)

Unfortunately, this doesn't work:

Main.hs:4:45: Couldn't match type ‘b’ with ‘a’ ‘b’ is a rigid type variable bound by the type signature for fmap :: (a -> b) -> MakeString a -> MakeString b at Main.hs:4:5 ‘a’ is a rigid type variable bound by the type signature for fmap :: (a -> b) -> MakeString a -> MakeString b at Main.hs:4:5 Expected type: b -> a Actual type: a -> b Relevant bindings include g :: a -> String (bound at Main.hs:4:24) f :: a -> b (bound at Main.hs:4:10) fmap :: (a -> b) -> MakeString a -> MakeString b (bound at Main.hs:4:5) In the second argument of ‘(.)’, namely ‘f’ In the first argument of ‘MakeString’, namely ‘(g . f)’

To understand why, let's compare the type for fmap (specialized to MakeString) with our mapMakeString type:

mapMakeString :: (b -> a) -> MakeString a -> MakeString b fmap :: (a -> b) -> MakeString a -> MakeString b

Notice that fmap has the usual a -> b parameter, whereas mapMakeString instead has a b -> a, which goes in the opposite direction. More on that next.

Exercise: Convince yourself that the mapMakeString function has the only valid type signature we could apply to it, and that the implementation is the only valid implementation of that signature. (It's true that you can change the variable names around to cheat and make the first parameter a -> b, but then you'd also have to modify the rest of the type signature.)

Contravariance

What we just saw is that fmap takes a function from a -> b, and lifts it to f a -> f b. Notice that the a is always the "input" in both cases, whereas the b is the "output" in both cases. By contrast, mapMakeString has the normal f a -> f b, but the initial function has its types reversed: b -> a. This is the core of covariance vs contravariance:

  • In covariance, both the original and lifted functions point in the same direction (from a to b)
  • In contravariance, the original and lifted functions point in opposite directions (one goes from a to b, the other from b to a)

This is what is meant when we refer to the normal Functor typeclass in Haskell as a covariant functor. And as you can probably guess, we can just as easily define a contravariant functor. In fact, it exists in the contravariant package. Let's go ahead and use that typeclass in our toy example:

import Data.Functor.Contravariant newtype MakeString a = MakeString { makeString :: a -> String } instance Contravariant MakeString where contramap f (MakeString g) = MakeString (g . f) showInt :: MakeString Int showInt = MakeString show plus3ShowInt :: MakeString Int plus3ShowInt = contramap (+ 3) showInt main :: IO () main = putStrLn $ makeString plus3ShowInt 5

Our implementation of contramap is identical to the mapMakeString used before, which hopefully isn't too surprising.

Example: filtering with Predicate

Let's say we want to print out all of the numbers from 1 to 10, where the English word for that number is more than three characters long. Using a simple helper function english :: Int -> String and filter, this is pretty simple:

greaterThanThree :: Int -> Bool greaterThanThree = (> 3) lengthGTThree :: [a] -> Bool lengthGTThree = greaterThanThree . length englishGTThree :: Int -> Bool englishGTThree = lengthGTThree . english english :: Int -> String english 1 = "one" english 2 = "two" english 3 = "three" english 4 = "four" english 5 = "five" english 6 = "six" english 7 = "seven" english 8 = "eight" english 9 = "nine" english 10 = "ten" main :: IO () main = print $ filter englishGTThree [1..10]

The contravariant package provides a newtype wrapper around such a -> Bool functions, called Predicate. We can use this newtype to wrap up our helper functions and avoid explicit function composition:

import Data.Functor.Contravariant greaterThanThree :: Predicate Int greaterThanThree = Predicate (> 3) lengthGTThree :: Predicate [a] lengthGTThree = contramap length greaterThanThree englishGTThree :: Predicate Int englishGTThree = contramap english lengthGTThree english :: Int -> String english 1 = "one" english 2 = "two" english 3 = "three" english 4 = "four" english 5 = "five" english 6 = "six" english 7 = "seven" english 8 = "eight" english 9 = "nine" english 10 = "ten" main :: IO () main = print $ filter (getPredicate englishGTThree) [1..10]

NOTE: I'm not actually recommending this as a better practice than the original, simpler version. This is just to demonstrate the capability of the abstraction.

Bifunctor and Profunctor

We're now ready to look at something a bit more complicated. Consider the following two typeclasses: Profunctor and Bifunctor. Both of these typeclasses apply to types of kind * -> * -> *, also known as "a type constructor that takes two arguments." But let's look at their (simplified) definitions:

class Bifunctor p where bimap :: (a -> b) -> (c -> d) -> p a c -> p b d class Profunctor p where dimap :: (b -> a) -> (c -> d) -> p a c -> p b d

They're identical, except that bimap takes a first parameter of type a -> b, whereas dimap takes a first parameter of type b -> a. Based on this observation, and what we've learned previously, we can now understand the documentation for these two typeclasses:

Bifunctor: Intuitively it is a bifunctor where both the first and second arguments are covariant.

Profunctor: Intuitively it is a bifunctor where the first argument is contravariant and the second argument is covariant.

These are both bifunctors since they take two type parameters. They both treat their second parameter in the same way: covariantly. However, the first parameter is treated differently by the two: Bifunctor is covariant, and Profunctor is contravariant.

Exercise Try to think of a few common datatypes in Haskell that would be either a Bifunctor or Profunctor, and write the instance.

Hint Some examples are Either, (,), and -> (a normal function from a to b). Figure out which is a Bifunctor and which is a Profunctor.

Solution

class Bifunctor p where bimap :: (a -> b) -> (c -> d) -> p a c -> p b d class Profunctor p where dimap :: (b -> a) -> (c -> d) -> p a c -> p b d instance Bifunctor Either where bimap f _ (Left x) = Left (f x) bimap _ f (Right x) = Right (f x) instance Bifunctor (,) where bimap f g (x, y) = (f x, g y) instance Profunctor (->) where -- functions dimap f g h = g . h . f

Make sure you understand why these instances work the way they do before moving on.

Bivariant and invariant

There are two more special cases for variance: bivariant means "both covariant and contravariant," whereas invariant means "neither covariant nor contravariant." The only types which can be bivariant are phantoms, where the type doesn't actually exist. As an example:

import Data.Functor.Contravariant (Contravariant (..)) data Phantom a = Phantom instance Functor Phantom where fmap _ Phantom = Phantom instance Contravariant Phantom where contramap _ Phantom = Phantom

Invariance will occur if:

  • A type parameter is used multiple times in a data structure, both positively and negatively, e.g.:

    data ToFrom a = ToFrom (a -> Int) (Int -> a)
  • A type parameter is used in type which is itself invariant in the parameter, e.g.:

    newtype ToFromWrapper a = ToFromWrapper (ToFrom a)

    Note that even though the parameter only appears once here, it appears twice in ToFrom itself.

  • In special types like references, e.g.:

    data IORef a -- a is invariant newtype RefWrapper a = RefWrapper (IORef a) -- also invariant

Exercise Convince yourself that you can not make an instance of either Functor nor Contravariant for ToFrom or IORef.

Exercise Explain why there's also no way to make an instance of Bifunctor or Profunctor for these datatypes.

As you can see, the a parameter is used as both the input to a function and output from a function in ToFrom. This leads directly to our next set of terms.

NOTE There's a good Reddit discussion which led to clarification of these section.

Positive and negative position

Let's look at some basic covariant and contravariant data types:

data WithInt a = WithInt (Int -> a) data MakeInt a = MakeInt (a -> Int)

By now, you should hopefully be able to identify that WithInt is covariant on its type parameter a, whereas MakeInt is contravariant. Please make sure you're confident of that fact, and that you know what the relevant Functor and Contravariant instance will be.

Can we give a simple explanation of why each of these is covariant and contravariant? Fortunately, yes: it has to do with the position the type variable appears in the function. In fact, we can even get GHC to tell us this by using Functor deriving:

{-# LANGUAGE DeriveFunctor #-} data MakeInt a = MakeInt (a -> Int) deriving Functor

This results in the (actually quite readable) error message:

Can't make a derived instance of ‘Functor MakeInt’: Constructor ‘MakeInt’ must not use the type variable in a function argument In the data declaration for ‘MakeInt’

Another way to say this is "a appears as an input to the function." An even better way to say this is that "a appears in negative position." And now we get to define two new terms:

  • Positive position: the type variable is the result/output/range/codomain of the function
  • Negative position: the type variable is the argument/input/domain of the function

When a type variable appears in positive position, the data type is covariant with that variable. When the variable appears in negative position, the data type is contravariant with that variable. To convince yourself that this is true, go review the various data types we've used above, and see if this logic applies.

But why use the terms positive and negative? This is where things get quite powerful, and drastically simplify your life. Consider the following newtype wrapper intended for running callbacks:

type Callback a = a -> IO () -- newtype CallbackRunner a = CallbackRunner (Callback a -> IO ()) -- Expands to: newtype CallbackRunner a = CallbackRunner ((a -> IO ()) -> IO ())

Is it covariant or contravariant on a? Your first instinct may be to say "well, a is a function parameter, and therefore it's contravariant. However, let's break things down a bit further.

Suppose we're just trying to deal with a -> IO (). As we've established many times above: this function is contravariant on a, and equivalently a is in negative position. This means that this function expects on input of type a.

But now, we wrap up this entire function as the input to a new function, via: (a -> IO ()) -> IO (). As a whole, does this function consume an a, or does it produce an a? To get an intuition, let's look at an implementation of CallbackRunner Int for random numbers:

supplyRandom :: CallbackRunner Int supplyRandom = CallbackRunner $ \callback -> do int <- randomRIO (1, 10) callback int

It's clear from this implementation that supplyRandom is, in fact, producing an Int. This is similar to Maybe, meaning we have a solid argument for this also being covariant. So let's go back to our positive/negative terminology and see if it explains why.

In a -> IO (), a is in negative position. In (a -> IO ()) -> IO (), a -> IO () is in negative position. Now we just follow multiplication rules: when you multiply two negatives, you get a positive. As a result, in (a -> IO ()) -> IO (), a is in positive position, meaning that CallbackRunner is covariant on a, and we can define a Functor instance. And in fact, GHC agrees with us:

{-# LANGUAGE DeriveFunctor #-} import System.Random newtype CallbackRunner a = CallbackRunner { runCallback :: (a -> IO ()) -> IO () } deriving Functor supplyRandom :: CallbackRunner Int supplyRandom = CallbackRunner $ \callback -> do int <- randomRIO (1, 10) callback int main :: IO () main = runCallback supplyRandom print

Let's unwrap the magic, though, and define our Functor instance explicitly:

newtype CallbackRunner a = CallbackRunner { runCallback :: (a -> IO ()) -> IO () } instance Functor CallbackRunner where fmap f (CallbackRunner aCallbackRunner) = CallbackRunner $ \bCallback -> aCallbackRunner (bCallback . f)

Exercise 1: Analyze the above Functor instance and understand what is occurring.

Exercise 2: Convince yourself that the above implementation is the only one that makes sense, and similarly that there is no valid Contravariant instance.

Exercise 3: For each of the following newtype wrappers, determine if they are covariant or contravariant in their arguments:

newtype E1 a = E1 (a -> ()) newtype E2 a = E2 (a -> () -> ()) newtype E3 a = E3 ((a -> ()) -> ()) newtype E4 a = E4 ((a -> () -> ()) -> ()) newtype E5 a = E5 ((() -> () -> a) -> ()) -- trickier: newtype E6 a = E6 ((() -> a -> a) -> ()) newtype E7 a = E7 ((() -> () -> a) -> a) newtype E8 a = E8 ((() -> a -> ()) -> a) newtype E9 a = E8 ((() -> () -> ()) -> ())Lifting IO to MonadIO

Let's look at something seemingly unrelated to get a feel for the power of our new analysis tools. Consider the base function openFile:

openFile :: FilePath -> IOMode -> IO Handle

We may want to use this from a monad transformer stack based on top of the IO monad. The standard approach to that is to use the MonadIO typeclass as a constraint, and its liftIO function. This is all rather straightforward:

import System.IO import Control.Monad.IO.Class openFileLifted :: MonadIO m => FilePath -> IOMode -> m Handle openFileLifted fp mode = liftIO (openFile fp mode)

But of course, we all prefer using the withFile function instead of openFile to ensure resources are cleaned up in the presence of exceptions. As a reminder, that function has a type signature:

withFile :: FilePath -> IOMode -> (Handle -> IO a) -> IO a

So can we somehow write our lifted version with type signature:

withFileLifted :: MonadIO m => FilePath -> IOMode -> (Handle -> m a) -> m a

Try as we might, this can't be done, at least not directly (if you're really curious, see lifted-base and its implementation of bracket). And now, we have the vocabulary to explain this succinctly: the IO type appears in both positive and negative position in withFile's type signature. By contrast, with openFile, IO appears exclusively in positive position, meaning our transformation function (liftIO) can be applied to it.

Like what you learned here? Please check out the rest of our Haskell Syllabus or learn about FP Complete training.

Categories: Offsite Blogs

Sandy Maguire: Book Announcement

Planet Haskell - Tue, 11/08/2016 - 6:00pm
<article> <header> Book Announcement </header>

<time>November 9, 2016</time> meta, announcements

Last night something strange happened. For a brief moment, the internet somehow thought that my criticism of Elm was more important to discuss than the presidential election, because it was at the #1 spot on Hacker News. I’m not 100% sure how things end up at the #1 spot on Hacker News, but it sounds like a pretty desirable place to be.

My traffic yesterday was up three orders of magnitude from its average, so it seems like now’s as good a time as any to announce my new project:

I’m writing a book! It’s a gentle introduction to computer science, from first principles of electronics to category theory. If that sounds like the kind of thing you might be into, you can find it here.

</article>
Categories: Offsite Blogs

Philip Wadler: Anti-semitism, conjured and real

Planet Haskell - Mon, 11/07/2016 - 9:10am

Accusations of anti-semitism in the Labour party have gone virtually unchallenged, which is unconscionable because almost all of what is referred to as 'anti-semitism' is simply legitimate protest against Israel's oppression of Palestinians. David Plank at Jews Sans Frontiers has just published a thorough debunking
I've been lucky to rarely face anti-semitism in my personal life. So its salutary to be reminded the extent to which it actually exists in the world. If nothing else, this is something that Donald Trump does well.
Trump's campaign is based on dog-whistle racism, including anti-semitism, as called out in An Open Letter to Jared Kushner from one of your Employees and, more humorously, by Jon Stewart in The Day I Woke Up To Find Out Somebody Was Tweeting Weird Shit About Me.
 
Of course, many others than Jews have faced the same racism, as noted in The Price I’ve Paid for Opposing Donald Trump.
The issues at stake have been eloquently stated, more forthrightly than in most media, by Adam Gopnik in A Point of View. I expect most folk reading this will not be supporters of Trump, but, if you are, please listen to it before you vote.



Categories: Offsite Blogs

Ken T Takusagawa: [wisnmyni] Data as a number

Planet Haskell - Sun, 11/06/2016 - 10:04pm

Convert a number with possibly many leading zeroes in base M to base N, prefixing the base N output representation with leading zeroes in a way that unambigiously specifies the number of leading zeroes in base M input.  I think this is possible when M > N.  Some potentially useful conversions:

(M=16, N=10); (M=256, N=10); (M=256, N=100); (M=10; N=2)

The deeper idea is, whenever a string of characters represents raw data and not a word in a natural language, it should be encoded so that it is clear that the string is not meant to be interpreted as a word in natural language.  Unannotated hexadecimal fails this rule: if you see the string "deadbeef", is it a word, with a meaning perhaps related to food, or is it a number?  (Annotated hexadecimal, e.g., "0xdeadbeef" is clearly not a word.)  Simpler example: what does "a" mean, indefinite article or ten in hexadecimal?  English, and other orthographies which use Hindu-Arabic numerals, already have a character set which unambiguously state that a string encodes data and not words: the numerals.  (One could argue that numbers -- strings of Hindu-Arabic numerals -- have more meaning than strictly data: they have ordinality, they obey group and field axioms, etc.  However, we frequently see numbers which aren't that, e.g., serial numbers, phone numbers, ZIP codes, ID and credit card numbers.)

The inspiration was, expressing hashes in hexadecimal is silly.  Radix conversion is quick and easy for a computer; it should be in decimal with leading zeroes if necessary.  If making the hash compact is a goal, then base 26 or base 95, with some annotation signifying it is encoded data, is better than hexadecimal.

Some Haskell code demonstrating some of the conversions.

Categories: Offsite Blogs

FP Complete: Exceptions Best Practices in Haskell

Planet Haskell - Sun, 11/06/2016 - 8:00pm

Over the years, I've written a number of different documents, tutorials, comments, and libraries on how to do proper exception handling in Haskell. Most of this has culminated in the creation of safe-exceptions library, which I strongly recommend everyone use. That library contains a full tutorial which explains many of the more subtle points of exceptions, and describes how exceptions are handled by that library.

Overall, I consider that library uncontroversial, simply addressing the reality of exceptions in GHC today. This blog post is the opinionated part: how I recommend you use exceptions in Haskell, and how to structure your code around them. There are dissenting opinions, which is why this is an opinion blog post instead of library documentation. However, in my experience, these approaches are the best way to make your code robust.

This blog post is also part of the FP Complete Haskell Syllabus and part of our Haskell training.

The IO contract

A commonly stated position in the Haskell community around exceptions goes something like "all exceptions should be explicit at the type level, and async exceptions are terrible." We can argue as much as we want about this point in a theoretical sense. However, practically, it is irrelevant, because GHC has already chosen a stance on this: it supports async exceptions, and all code that runs in IO can have exceptions of any type which is an instance of Exception.

I'd go a step further, and say not only are we stuck with GHC's decisions, but GHC's decisions are a great point in the design space. I'll explain that below.

So take as a given: any code in IO can throw a runtime exception, and any thread can be killed at any time by an asynchronous exception.

Let's identify a few anti-patterns in Haskell exception handling, and then move on to recommended practices.

The badExceptT IO anti-pattern

A common (bad) design pattern I see is something like the following:

myFunction :: String -> ExceptT MyException IO Int

There are (at least) three problems with this:

  1. It's non-composable. If someone else has a separate exception type HisException, these two functions do not easily compose.
  2. It gives an implication which is almost certainly false, namely: the only exception that can be thrown from this function is MyException. Almost any IO code in there will have the ability to throw some other type of exception, and additionally, almost any async exception can be thrown even if no synchronous exception is possible.
  3. You haven't limited the possibility of exceptions, you've only added one extra avenue by which an exception can be thrown. myFunction can now either throwE or liftIO . throwIO.

It is almost always wrong to wrap an ExceptT, EitherT, or ErrorT around an IO-based transformer stack.

Separate issue: it's also almost always a bad idea to have such a concrete transformer stack used in a public-facing API. It's usually better to express a function in terms of typeclass requirements, using mtl typeclasses as necessary.

A similar pattern is

myFunction :: String -> ExceptT Text IO Int

This is usually done with the idea that in the future the error type will be changed from Text to something like MyException. However, Text may end up sticking around forever because it helps avoid the composition problems of a real data type. However that leads to expressing useful error data types as unstructured Text.

Generally the solution to the ExceptT IO anti-pattern is to return an Either from more functions and throw an exception for uncommon errors. Note that returning Either from ExceptT IO means there are now 3 distinct sources of errors in just one function.

Please note that using ExceptT, etc with a non-IO base monad (for example with pure code) is a perfectly fine pattern.

Mask-them-all anti-pattern

This anti-pattern goes like this: remembering to deal with async exceptions everywhere is hard, so I'll just mask them all.

Every time you do this, 17 kittens are mauled to death by the loch ness monster.

Async exceptions may be annoying, but they are vital to keeping a system functioning correctly. The timeout function uses them to great benefit. The Warp webserver bases all of its slowloris protection on async exceptions. The cancel function from the async package will hang indefinitely if async exceptions are masked. Et cetera et cetera.

Are async exceptions difficult to work with? Sometimes, yes. Deal with it anyway. Best practices include:

  • Use the bracket pattern wherever possible.
  • Use the safe-exceptions package.
  • If you have truly complex flow of control and non-linear scoping of resources, use the resourcet package.
The goodMonadThrow

Consider the following function:

foo <- lookup "foo" m bar <- lookup "bar" m baz <- lookup "baz" m f foo bar baz

If this function returns Nothing, we have no idea why. It could be because:

  1. "foo" wasn't in the map.
  2. "bar" wasn't in the map.
  3. "baz" wasn't in the map.
  4. f returned Nothing.

The problem is that we've thrown away a lot of information by having our functions return Maybe. Instead, wouldn't it be nice if the types of our functions were:

lookup :: Eq k => k -> [(k, v)] -> Either (KeyNotFound k) v f :: SomeVal -> SomeVal -> SomeVal -> Either F'sExceptionType F'sResult

The problem is that these types don't unify. Also, it's commonly the case that we really don't need to know about why a lookup failed, we just need to deal with it. For those cases, Maybe is better.

The solution to this is the MonadThrow typeclass from the exceptions package. With that, we would write the type signatures as:

lookup :: (MonadThrow m, Eq k) => k -> [(k, v)] -> m v f :: MonadThrow m => SomeVal -> SomeVal -> SomeVal -> m F'sResult

Versus the Either signature, we lose some information, namely the type of exception that could be thrown. However, we gain composability and unification with Maybe (as well as many other useful instances of MonadThrow, like IO).

The MonadThrow typeclass is a tradeoff, but it's a well thought out tradeoff, and usually the right one. It's also in line with Haskell's runtime exception system, which does not capture the types of exceptions that can be thrown.

Transformers

The following type signature is overly restrictive:

foo :: Int -> IO String

This can always be generalized with a usage of liftIO to:

foo :: MonadIO m => Int -> m String

This allows our function to easily work with any transformer on top of IO. However, given how easy it is to apply liftIO, it's not too horrible a restriction. However, consider this function:

bar :: FilePath -> (Handle -> IO a) -> IO a

If you want your inner function to live in a transformer on top of IO, you'll find it difficult to make it work. It can be done with lifted-base, but it's non-trivial. Instead, it's much better to express this function in terms of functions from either the safe-exceptions library, and get the following more generalized type signatures:

bar :: (MonadIO m, MonadMask m) => FilePath -> (Handle -> m a) -> m a

This doesn't just apply to exception handling, but also to dealing with things like forking threads. Another thing to consider in these cases is to use the Acquire type from resourcet.

Custom exception types

The following is bad practice:

foo = do if x then return y else error "something bad happened"

The problem is the usage of arbitrary string-based error messages. This makes it difficult to handle this exceptional case directly in a higher level in the call stack. Instead, despite the boilerplate overhead involved, it's best to define a custom exception type:

data SomethingBad = SomethingBad deriving Typeable instance Show SomethingBad where show SomethingBad = "something bad happened" instance Exception SomethingBad foo = do if x then return y else throwM SomethingBad

Now it's trivial to catch the SomethingBad exception type at a higher level. Additionally, throwM gives better exception ordering guarantees than error, which creates an exception in a pure value that needs to be evaluated before it's thrown.

One sore point is that some people strongly oppose a Show instance like this. This is an open discussion, but for now I believe we need to make the tradeoff at this point in the spectrum. The displayException method in the Exception typeclass may allow for a better resolution to this point in the future.

Why GHC's point in the design space is great

This section is adapted from a comment I made on Reddit in 2014.

I don't believe there is a better solution to sync exceptions, actually. That's because most of the time I see people complaining about IO throwing exceptions, what they really mean is "this specific exception just bit me, why isn't this exception explicit in the type signature?" To clarify my point further:

  • There are virtually 0 IO actions that can't fail for some reason.
  • If every IO action returned a IO (Either UniqueExceptionType a), the programming model would become incredibly tedious. * Also, it would become very likely that when a is (), it would be easy to forget to check the return type to see if an exception occurred.
  • If instead every IO action returned IO (Either SomeException a), we'd at least not have to deal with wrangling different exception types, and could use ErrorT to make our code simpler, but...
  • Then we've just reinvented exactly what IO does today, only less efficiently!

My belief is that people are simply ignoring the reality of the situation: the contract for IO implicitly includes "this action may also fail." And I mean in every single case. Built in, runtime exceptions hide that in the type, but you need to be aware of it. Runtime exceptions also happen to be far more efficient than using ErrorT everywhere.

And as much as some people complain that exceptions are difficult to handle correctly, I highly doubt ErrorT or anything else would be easier to work with, we'd just be trading in a well-developed, mostly-understood system for a system we think we understand.

Concrete example: readLine

After a request on Twitter, I decided to add a little section here showing a pragmatic example: how should we implement a function to read a line from stdin and parse it? Let's start with a simpler question: how about just parsing an input String? We'd like to have a meaningful exception that tells us which value didn't parse (the input String) and what type we tried to parse it as. We'll implement this as a readM function:

#!/usr/bin/env stack -- stack --resolver lts-7.8 runghc --package safe-exceptions {-# OPTIONS_GHC -Wall -Werror #-} import Control.Exception.Safe (Exception, MonadThrow, SomeException, throwM) import Data.Typeable (TypeRep, Typeable, typeRep) import Text.Read (readMaybe) data ReadException = ReadException String TypeRep deriving (Typeable) instance Show ReadException where show (ReadException s typ) = concat [ "Unable to parse as " , show typ , ": " , show s ] instance Exception ReadException readM :: (MonadThrow m, Read a, Typeable a) => String -> m a readM s = res where res = case readMaybe s of Just x -> return x Nothing -> throwM $ ReadException s (typeRep res) main :: IO () main = do print (readM "hello" :: Either SomeException Int) print (readM "5" :: Either SomeException Int) print (readM "5" :: Either SomeException Bool) -- Also works in plain IO res1 <- readM "6" print (res1 :: Int) res2 <- readM "not an int" print (res2 :: Int) -- will never get called

This meets our criteria of having a generalizable function to multiple monads and useful exceptions. If we now make a readLine function that reads from stdin, we have essentially two different choices of type signature:

  • readLine1 :: (MonadIO m, MonadThrow n, Read a, Typeable a) => m (n a): With this signature, we're saying "the case of failure is pretty common, and we therefore don't want to mix it in with the same monad that's handling IO side-effects.
  • readLine2 :: (MonadIO m, MonadThrow m, Read a, Typeable a) => m a: By contrast, we can actually combine the two different monads (IO side-effects and failure) into one layer. This is implicitly saying "failure is a case we don't usually want to deal with, and therefore the user should explicitly use tryAny or similar to extract such failures. That said, in practice, there's not much point in having both MonadIO and MonadThrow, since you can just use liftIO to combine them (as you'll see in a moment). So instead, our signature is readLine2 :: (MonadIO m, Read a, Typeable a) => m a.

Which one of these you choose in practice really does depend on your personal preferences. The former is much more explicit about the failure. However, in general I'd steer away from it, since - like ExceptT over IO - it gives the false impression that all failures are captured by the inner value. Still, I thought it was worth demonstrating both options:

#!/usr/bin/env stack -- stack --resolver lts-7.8 runghc --package safe-exceptions {-# OPTIONS_GHC -Wall -Werror #-} import Control.Exception.Safe (Exception, MonadThrow, SomeException, throwM) import Control.Monad (join) import Control.Monad.IO.Class (MonadIO, liftIO) import Data.Typeable (TypeRep, Typeable, typeRep) import Text.Read (readMaybe) data ReadException = ReadException String TypeRep deriving (Typeable) instance Show ReadException where show (ReadException s typ) = concat [ "Unable to parse as " , show typ , ": " , show s ] instance Exception ReadException readM :: (MonadThrow m, Read a, Typeable a) => String -> m a readM s = res where res = case readMaybe s of Just x -> return x Nothing -> throwM $ ReadException s (typeRep res) readLine1 :: (MonadIO m, MonadThrow n, Read a, Typeable a) => m (n a) readLine1 = fmap readM (liftIO getLine) -- Without the usage of liftIO here, we'd need both MonadIO and -- MonadThrow constraints. readLine2 :: (MonadIO m, Read a, Typeable a) => m a readLine2 = liftIO (join readLine1) main :: IO () main = do putStrLn "Enter an Int (non-runtime exception)" res1 <- readLine1 print (res1 :: Either SomeException Int) putStrLn "Enter an Int (runtime exception)" res2 <- readLine2 print (res2 :: Int)See also

Like what you learned here? Please check out the rest of our Haskell Syllabus or learn about FP Complete training.

Categories: Offsite Blogs

Yesod Web Framework: Use MySQL Safely in Yesod Applications

Planet Haskell - Sun, 11/06/2016 - 3:39pm

With the latest version (0.1.4) of the mysql library, we now have the machinery needed to use it properly in a concurrent setting. In the past, any multi-threaded use was a little risky, although in practice it seems to have been satisfactory for applications which were not too demanding.

The necessary changes have just been made to the MySQL version of the scaffolding, and are described here. Existing Yesod sites should be updated in a similar manner. This post should give you all you need to know, but further background can be found in the MySQL manual and Roman Cheplyaka's blog.

But It Worked Anyway, Didn't It?

Let's start by reviewing why the mysql library works automatically in a single-threaded program, and why we might have got away with it most of the time in Yesod applications.

The underlying C library (libmysqlclient) requires a one-off initialisation, and then each thread in which it is called must be initialised to allocate some thread-local data. However, these actions are carried out automatically when a connect call is made, if they have not already been done. So nothing further is needed in a single-threaded program: a connect necessarily comes first, and it performs the required initialisations.

This behaviour of the connect call probably also explains why we have mostly got away with ignoring the problem in Yesod applications. Warp creates lightweight, Haskell threads by default, and these run in a rather small number of OS threads. When a new connection is opened and added to the pool, the OS thread running at the time will be initialised, as just described. Due to the small number of these threads, there is a reasonable chance that this is the first database action in each of them, resulting in correct initialisation. But there are no guarantees!

Correct Multi-Threaded Use

To be completely correct, we have to do all of the following:

  • Initialise the library as a whole.
  • Use bound threads for those which might perform database operations.
  • Initialise each thread properly.
  • Finalise each thread to free the memory used by its thread-local state.

The library initialisation is not thread-safe; it needs to be called separately to ensure that subsequent connect calls, occurring in multiple threads, detect that it has been done and do not repeat it themselves. This has been achieved in the scaffolding by calling MySQL.initLibrary from makeFoundation, before any database actions are carried out:

... import qualified Database.MySQL.Base as MySQL ... makeFoundation appSettings = do ... MySQL.initLibrary

The point about bound threads is that they provide a guarantee that related initialisation and database operations really do occur in the same OS thread. However, using them means that OS threads are created frequently, and the argument given above no longer applies, not even as an approximation: the threads definitely need explicit initialisation. They also need finalising to avoid a memory leak - again this is made important by the large number of threads. (There are some situations in which the finalisation can be omitted, but check the documentation carefully before doing so.)

The settings passed to warp can be used to make it spawn bound threads, instead of Haskell threads, and to specify functions to initialise and finalise them. This code shows how it is now done in the scaffolding, in Application.hs:

warpSettings foundation = ... $ setFork (\x -> void $ forkOSWithUnmask x) $ setOnOpen (const $ MySQL.initThread >> return True) $ setOnClose (const MySQL.endThread) defaultSettings

Warp forks a new thread to handle each connection, using the function specified by setFork. The functions passed to setOnOpen and setOnClose are called right at the start of processing the connection, and right at the end, so they are valid places to initialise and finalise the thread for use by the mysql library.

The argument to setFork is a function which creates bound threads. If you are wondering why it is written the way it is, instead of void . forkOSWithUnmask, it simply avoids the need for the ImpredicativeTypes language extension, which is considered fragile and is sometimes broken by new compiler releases!

Unfortunately, forkOSWithUnmask is not exported by Control.Concurrent until base-4.9 (ie GHC 8), so, when using earlier versions, we have to copy its definition into our code:

{-# LANGUAGE RankNTypes #-} ... import GHC.IO (unsafeUnmask) ... forkOSWithUnmask :: ((forall a . IO a -> IO a) -> IO ()) -> IO ThreadId forkOSWithUnmask io = forkOS (io unsafeUnmask)What About Efficiency?

OS threads are more expensive than Haskell threads, but the difference may not matter much compared to all the other processing which is going on in a real application. It would be wise to do some benchmarks before worrying about it!

One possible optimisation is to make sure that HTTP keepalive is used, since warp creates a thread per connection, not per request. Some reverse proxies might need explicit configuration for this.

Categories: Offsite Blogs

Proving Programs Correct Using Plain Old Java Types

Lambda the Ultimate - Fri, 11/04/2016 - 8:27am

Proving Programs Correct Using Plain Old Java Types, by Radha Jagadeesan, Alan Jeffrey, Corin Pitcher, James Riely:

Tools for constructing proofs of correctness of programs have a long history of development in the research community, but have often faced difficulty in being widely deployed in software development tools. In this paper, we demonstrate that the off-the-shelf Java type system is already powerful enough to encode non-trivial proofs of correctness using propositional Hoare preconditions and postconditions.

We illustrate the power of this method by adapting Fähndrich and Leino’s work on monotone typestates and Myers and Qi’s closely related work on object initialization. Our approach is expressive enough to address phased initialization protocols and the creation of cyclic data structures, thus allowing for the elimination of null and the special status of constructors. To our knowledge, our system is the first that is able to statically validate standard one-pass traversal algorithms for cyclic graphs, such as those that underlie object deserialization. Our proof of correctness is mechanized using the Java type system, without any extensions to the Java language.

Not a new paper, but it provides a lightweight verification technique for some program properties that you can use right now, without waiting for integrated theorem provers or SMT solvers. Properties that require only monotone typestates can be verified, ie. those that operations can only move the typestate "forwards".

In order to achieve this, they require programmers to follow a few simple rules to avoid Java's pervasive nulls. These are roughly: don't assign null explicitly, be sure to initialize all fields when constructing objects.

Categories: Offsite Discussion

Toby Goodwin: Debian chroot on Android

Planet Haskell - Fri, 11/04/2016 - 2:47am

Sometimes, a simple idea — so simple it can be distilled down to 4 words — can be truly astounding.

Why?

For quite a while, I've been considering the best way to ensure the resilience, security, and accessibility of various pieces of personal data. There are several different categories, and no solution will be optimal for all of them. My music collection, for example, is large, non-secret, and largely replaceable (although the thought of dragging that enormous box of CDs out of the garage and reripping them all is pretty daunting!) The music lives on a server in my home, with my own backups. I upload medium bitrate versions to a cloud music service, and I have a low bitrate copy on my laptop and phone. So that's pretty well covered.

A similar scheme covers my photos and videos. They are much less replaceable than music, but fortunately much smaller, so there are a few extra copies kicking about.

Then, I have a few tiny things that I want to keep in sync across various devices. For example, today's todo list, my "blue skies ideas" list, and my password store. I've looked at syncthing, which is an awesome project, and I'm sure I'm going to find a good use for it someday.

But for these things, git is really the obvious solution. Most of them are already git repos, including my password-store, the only missing piece is a git client on my phone. So I was searching for recommendations for Android git clients, and these words jumped out at me:

create a debian image, mount it in your android device and chroot to it

My flabber was well and truly gasted.

How?

It's very straightforward. From some debian instance on which you have root, run:

debootstrap --foreign --arch=armhf jessie jessie

Tar up the resulting tree in jessie, copy it to android, unpack it (ah, but where?), chroot, and then run:

debootstrap --second-stage Which?

Here are some things I've used: ssh, rsync, dash, bash, the rc shell (which I happen to maintain). All the usual userland tools, mv, chmod, etc. These (of course) are the proper full GNU versions, so you don't keep being bitten by the little discrepancies in, for instance, the busybox versions.

Package management with apt-get and dpkg. And perl, git, nano, vim, update-alternatives (so I never have to see nano again), less, man.

I started installing the pass package, but that pulls in gtk, pango and a whole bunch of other things I'm not going to use. So I downloaded password-store and installed it myself.

The ps command (you need to mount /proc in the chroot of course), top, strace, lsof. You can even strace android processes, it all Just Works. (OK, lsof gets upset because it's inside a chroot and it can't see all the mount points that /proc/mounts says exist. But still.)

I thought it might be fun to run mosh. It installed fine, but then bombed out with a weird error. I went on a bit of a wild goose chase, and concluded (it was late at night) that I needed a fix from the development version. So I cloned the mosh repo on github, installed a whole heap of build tools, compilers, libraries, and built mosh. On my phone!

In fact, the problem was simpler than that, and easily solved by using the stty command to declare my terminal size. And then I had to open up some ports in android's firewall... with iptables of course.

I could go on, but you're getting the idea. In summary, this is not something pretending to be a GNU/Linux system. It's the real deal.

Of course, there are some missing pieces, of which the most serious is the lack of daemons. I've installed etckeeper, but there will be no daily autocommit.

Ping doesn't work, because even a root shell is not allowed to use raw sockets. You can create a user, but it's not able to do much... I'll look at this some more when I have time, but I'm just running everything as root at the moment. Android's systems for users, permissions, and capabilities are entirely unfamiliar to me, although I'm trying to learn.

Where?

I made and remade my chroot several times before I was happy with it. Hopefully these notes will make things quicker for you.

First of all, Debian wants a “real” filesystem, which is to say, anything except FAT. Of the existing partitions, an obvious choice would be /data, which on my phone is ext4. Unfortunately, the major major drawback of my phone is that its /data is tiddly, just 1GB, and perennially full. (I did try the chroot on /data, before realising the fatal flaw. One curiosity is that /data is mounted with nodev, so populating /dev fails till you remount without nodev. You might think it would be better to bind mount the real /dev into the chroot anyway, and you might well be right. But I've been running with the /dev made by debootstrap with no problems.)

So it's time to repartition my 32GB SD card. Android apparently doesn't support GPT which is only a minor nuisance. I do hate all that primary / extended / logical nonsense though, it's so 1980s.

Much, much more seriously, it complains bitterly if it finds an SD card without a FAT partition. This is infuriating. The kernel supports ext3 just fine (and ext4 too, at least for partitions on fixed internal storage, although apparently not for the SD card, which makes no sense to me). So, if I insert a card that happens to have an ext3 partition on it, why not just mount it? Or if there's some scenario I'm not aware of that might not work quite right, notify a dialogue that explains and offers to mount the partition anyway. What actually happens is a notification that tells you the SD card is “damaged”, and offers to format it. Arrggh!

(I have reason to believe that the latest versions of Android will countenance SD cards with real file systems, although I need to research this further.)

My next try was a 50MB FAT partition, and the remainder ext3. This largely worked, but it didn't leave anywhere for android to move those apps which are prepared to live on SD card, absolutely vital to squeeze some extra apps onto my old phone.

The final iteration was a 4GB FAT partition, and the rest ext3. Of course, I don't need 28GB for the chroot itself: it starts off well under 1G, and even after installing loads of stuff I'm still under 2G. But I realised that I'd be able to put my music collection on the ext3 partition, which would save the tedious step of renaming everything to comply with FAT restrictions (mainly the prohibition on : in file names). Of course, I can now rsync-over-ssh the music from my laptop, which seems to go quicker than via USB.

Another annoyance is that the ext3 partition on the SD card isn't automatically mounted. I've spent some time in the past trying to find a boot time hook I can use, but with no luck. So I have to do this from the android shell every time my phone reboots, using a helper script cunningly located under the mount point:

root@android:/ # cat /data/ext3/m #!/system/bin/sh mount -t ext3 /dev/block/mmcblk1p5 /data/ext3 What?

Far and away the nicest way to communicate with the chroot is to plug into a laptop or desktop and use adb shell from the Android SDK. At that point, it's scarcely different from sshing to a remote server.

Of course, the whole point of the phone is that it's portable. On the move, I'm using Jack Palevich's Terminal Emulator for Android and Klaus Weidner's Hacker's Keyboard. The keyboard has all the keys you need — Esc, Tab, Ctrl etc — so it's ideal for non-trivial tasks (such as vim!). But the tiny keys are very fiddly on my phone, especially in portrait, so I sometimes stick to my usual keyboard.

I've got a trivial helper script to start my favourite shell under ssh-agent:

root@android:/ # cat /data/ext3/ch #!/system/bin/sh exec chroot /data/ext3/jessie /usr/bin/ssh-agent /usr/bin/rc -l Whither?

So I have a fantastic solution to my document and password management problems. And a great ssh client. And a whole new architecture to build my projects on, most of which aren't the sort of thing that it makes much sense to run on a phone, but building in different places is always good for portability.

I'd heard that Android uses a "modified Linux" kernel, so I wasn't really expecting any of this stuff to work properly, let alone tools like strace and lsof. Apparently, though, the changes were folded back into the mainline kernel at the 3.3 release. My (3 year old) phone runs 3.4.5, so presumably this a fairly vanilla kernel.

This is awesome. Google has its faults, but their commitment to free software easily earns them the “least evil” prize among the current Internet quintumvirate. (That's Google, Apple, Facebook, Amazon, and Microsoft, for anyone who's been asleep the last few years.)

Realising that, yes, that computer in my pocket is a pukka Linux box has endeared me even further to Android. I'd love to write some apps for it... except I've already got more than enough projects to fill my “copious spare time”!

Update November 2016

A couple of new things I've discovered since writing this article.

First, the debootstrap command is available on Fedora, from the system repository! So you don't need a Debian box to build the initial system image. (Having debootstrap around is also handy for making a Debian container on my desktop.)

Secondly, I think the reason Android won't automatically mount an ext2/3/4 partition is that it has no idea how to map UIDs and GIDs on the filesystem. Any obvious trivial solution, such as “only UID 0 can read or write”, would make it inaccessible to Android processes. Remember, you're not supposed to root your Android device! I've just ordered my next phone, which is supported by CyanogenMod, so it's likely that I'll end up with an OS that is rather more enlightened about users having full control of their devices. Having said that, I don't believe it has an SD slot, so the issue of real FS support won't arise.

Categories: Offsite Blogs

The haskell-lang.org team: Updates for November 3, 2016

Planet Haskell - Thu, 11/03/2016 - 9:00am

The following changes have been made since September:

Categories: Offsite Blogs

Michael Snoyman: Designing APIs for Extensibility

Planet Haskell - Wed, 11/02/2016 - 6:00pm

This is an old bit of content that I wrote, and I'm relocating to this blog for posterity. I've actually been using this technique in practice to a large extent over the past few libraries with settings I've written, and overall like it. But it's definitely opinionated, your mileage may vary.

Every time you make a breaking change in your API, it means that- potentially- one or more of your users will need to change his/her code to adapt. Even if this update is trivial, it adds friction to the code maintenance process. On the other hand, we don't want to be constrained by bad design choices early on in a project, and sometimes a breaking API change is the best option.

The point of this document, however, is to give you a third option: design your APIs from the outset to be extensible. There are common techniques employed in the Haskell world to make APIs that are resilient to changing feature-sets, and by employing them early on in your design process, you can hopefully avoid the painful choices between a better API and happy users.

Almost all techniques start with implementation hiding. Guidelines here are simple: don't expose anything non-public. For example, if you write a number of helper functions, you may not want to start off by exposing them, since you're then telling users that these are good, stable functions to be relied upon. Instead, use explicit export lists on your modules and only include functions that are intended for public consumption.

More important- and more tricky- than functions are data constructors. In many cases, you want to avoid exposing the internals of your data types to users, to allow you to expand on them in the future. A common use case for this is some kind of a data type providing configuration information. Consider that you're going to communicate with some web services, so you write up the following API:

module MyAPI ( Settings (..) , makeAPICall ) where data Settings = Settings { apiKey :: Text , hostName :: Text } makeAPICall :: Settings -> Foo -> IO Bar

The way your users will access this will be something like:

makeAPICall Settings { apiKey = myAPIKey , hostName = "www.example.com" } myFoo

Now suppose a user points out that, in some cases, the standard port 80 is not used for the API call. So you add a new field port :: Int to your Settings constructor. This will break your user's code, since the port field will not be set.

Instead, a more robust way of specifying your API will look like:

module MyAPI ( Settings , mkSettings , setHostName , makeAPICall ) where data Settings = Settings { apiKey :: Text , hostName :: Text } -- | Create a @Settings@ value. Uses default value for host name. mkSettings :: Text -- ^ API Key -> Settings mkSettings key = Settings { apiKey = key , hostName = "www.example.com" } setHostName :: Text -> Settings -> Settings setHostName hn s = s { hostName = hn } makeAPICall :: Settings -> Foo -> IO Bar

Now your user code will instead look like:

makeAPICall (mkSettings myAPIKey) myFoo

This has the following benefits:

  • The user is not bothered to fill in default values (in this case, the hostname).
  • Extending this API to allow for more fields in the future is trivial: add a new set* function. Internally, you'll add a field to Settings and set a default value in mkSettings.

One thing to note: please do not expose the field accessors directly. If you want to provide getter functions in addition to setters, write them explicitly, e.g.:

getHostName :: Settings -> Text getHostName = hostName

The reason for this is that by exposing field accessors, users will be able to write code such as:

(mkSettings myAPIKey) { hostName = "www.example.org" }

This ties your hand for future internal improvements, since you are now required to keep a field of name hostName with type Text. By just using set and get functions, you can change your internal representation significantly and still provide a compatibility layer.

For those of you familiar with other languages: this is in fact quite similar to the approach taken in Java or C#. Just because Java does it doesn't mean it's wrong.

Note that this advice is different to, and intended to supersede, the settings type approach. Projects like Warp which previously used that settings type approach are currently migrating to this more extensible approach.

Also, while settings have been used here as a motivating example, the same advice applies to other contexts.

Internal modules

One downside of implementation hiding is that it can make it difficult for users to do things you didn't intend for them to do with your API. You can always add more functionality on demand, but the delay can be a major nuissance for users. A compromise solution in the Haskell community is to provide a .Internal module for your project which exports not-quite-public components. For example, in wai, the Response constructors are exposed in a Network.Wai.Internal module. Normally, users are supposed to use smart constructors like responseFile, but occasionally they'll want more fine-grained control.

Categories: Offsite Blogs