News aggregator

Yield for Effects

Haskell on Reddit - Fri, 01/15/2016 - 6:31am
Categories: Incoming News

Well-Typed.Com: Efficient Amortised and Real-Time Queues in Haskell

Planet Haskell - Fri, 01/15/2016 - 4:58am

A queue is a datastructure that provides efficient—O(1)—operations to remove an element from the front of the queue and to insert an element at the rear of the queue. In this blog post we will discuss how we can take advantage of laziness to implement such queues in Haskell, both with amortised and with worst-case O(1) bounds.

The results in this blog post are not new, and can be found in Chris Okasaki’s book “Purely Functional Data Structures”. However, the implementation and presentation here are different from Okasaki’s. In particular, the technique we use for real-time datastructures is more explicit and should scale to datastructures other than queues more easily than Okasaki’s.

Non-solution: Lists

To set the stage, consider this first attempt at implementing queues:

class Queue q where
  empty :: q a
  head  :: q a -> a
  tail  :: q a -> q a
  snoc  :: q a -> a -> q a

data Queue0 a = Q0 [a]

instance Queue Queue0 where
  empty = Q0 []
  head (Q0 (x:_ )) = x
  tail (Q0 (_:xs)) = Q0 xs
  snoc (Q0 xs ) x  = Q0 (xs ++ [x])

What is the complexity of head and snoc in this representation? Your first instinct might be to say that head has O(1) complexity (after all, it doesn’t do anything but a pattern match) and that snoc has O(n) complexity, because it needs to traverse the entire list before it can append the element.

However, Haskell is a lazy language. All that happens when we call snoc is that we create a thunk (a suspended computation), which happens in O(1) time. Consider adding the elements [1..5] into an empty queue, one at a time:

Q0 []
Q0 ([] ++ [1])
Q0 (([] ++ [1]) ++ [2])
Q0 ((([] ++ [1]) ++ [2]) ++ [3])
Q0 (((([] ++ [1]) ++ [2]) ++ [3]) ++ [4])
Q0 ((((([] ++ [1]) ++ [2]) ++ [3]) ++ [4]) ++ [5])

Now when we call head on the resulting queue, (++) needs to traverse this entire chain before it can find the first element; since that chain has O(n) length, the complexity of head is O(n).

Strict, Non-Persistent Queues

Thinking about complexity in a lazy setting can be confusing, so let’s first think about a spine strict queue. In order to define it, we will need a spine-strict list:

data StrictList a = SNil | SCons a !(StrictList a)

The bang annotation here means that evaluating an SCons node to weak-head normal form (for instance by pattern matching on it) will also force its tail to weak-head normal form, and hence the entire spine of the list; we cannot have an SCons node with a pointer to an unevaluated tail.
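
To see the difference from ordinary lists, here is a small sketch (spineDemo is our name, purely for illustration):

spineDemo :: StrictList Int
spineDemo = SCons 1 undefined

-- Forcing spineDemo to weak-head normal form also forces its tail:
--   case spineDemo of SCons _ _ -> ()   -- *** Exception: Prelude.undefined
-- whereas case (1 : undefined) of (_:_) -> () succeeds, because ordinary
-- cons cells do not force their tails.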

We will also need a few operations on strict lists:

-- | Append two strict lists
app :: StrictList a -> StrictList a -> StrictList a
app SNil         ys = ys
app (SCons x xs) ys = SCons x (app xs ys)

-- | Reverse a strict list
rev :: StrictList a -> StrictList a
rev = go SNil
  where
    go :: StrictList a -> StrictList a -> StrictList a
    go acc SNil         = acc
    go acc (SCons x xs) = go (SCons x acc) xs

With the definition of strict lists in hand, we can attempt our next queue implementation:

data Queue1 a = Q1 !Int !(StrictList a) !Int !(StrictList a)

Instead of using a single list, we split the queue into two parts: the front of the queue and the rear of the queue. The front of the queue will be stored in normal order, so that we can easily remove elements from the front of the queue; the rear of the queue will be stored in reverse order, so that we can also easily insert new elements at the end of the queue.
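
For example, after inserting the elements 1 up to 5 into an empty queue we end up with (using list shorthand for the StrictLists):

Q1 3 [1,2,3] 2 [5,4]

with 1, 2 and 3 at the front in order, and 4 and 5 at the rear in reverse.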

In addition, we also record the size of both lists. We will use this to enforce the following invariant:

Queue Invariant: The front of the queue cannot be shorter than the rear.

(Simpler invariants are also possible, but this invariant is the one we will need later so we will use it throughout this blogpost.)

When the invariant is violated, we restore it by moving the elements from the rear of the queue to the front; since the rear of the queue is stored in reverse order, but the front is not, the rear must be reversed:

inv1 :: Queue1 a -> Queue1 a
inv1 q@(Q1 f xs r ys)
  | f < r     = Q1 (f+r) (xs `app` rev ys) 0 SNil
  | otherwise = q

The invariant can be violated when we shrink the front or grow the rear, so we end up with this implementation of the Queue interface:

instance Queue Queue1 where
  empty = Q1 0 SNil 0 SNil
  head (Q1 _ (SCons x _ ) _ _ ) = x
  tail (Q1 f (SCons _ xs) r ys) = inv1 $ Q1 (f-1) xs r ys
  snoc (Q1 f xs r ys) y         = inv1 $ Q1 f xs (r+1) (SCons y ys)

Worst-Case versus Amortised Complexity

Since we don’t have to think about laziness, the complexity of this queue implementation is a bit easier to determine. Clearly, head is O(1), and both tail and snoc have worst case O(n) complexity because rev has O(n) complexity. However, consider what happens when we insert [1..7] into an empty queue:

Q1 0 []     0 []
Q1 1 [1]    0 []       -- invariant restored
Q1 1 [1]    1 [2]
Q1 3 [1..3] 0 []       -- invariant restored
Q1 3 [1..3] 1 [4]
Q1 3 [1..3] 2 [5,4]
Q1 3 [1..3] 3 [6,5,4]
Q1 7 [1..7] 0 []       -- invariant restored

Notice what happens: we only need to reverse n elements after having inserted n elements; we therefore say that the amortised complexity (the complexity averaged over all operations) of the reverse is in fact O(1)—with one proviso, as we shall see in the next section.

Amortisation versus Persistence

The analysis in the previous section conveniently overlooked one fact: since values are immutable in Haskell, nothing is stopping us from reusing a queue multiple times. For instance, if we started from

Q1 3 [1..3] 3 [6,5,4]

we might attempt to insert 7, then 8, then 9, and finally 10 into this (same) queue:

Q1 7 [1,2,3,4,5,6,7]  0 []   -- invariant restored
Q1 7 [1,2,3,4,5,6,8]  0 []   -- invariant restored
Q1 7 [1,2,3,4,5,6,9]  0 []   -- invariant restored
Q1 7 [1,2,3,4,5,6,10] 0 []   -- invariant restored

Notice that each of these single insertions incurs the full cost of a reverse. Thus, claiming an amortised O(1) complexity is only valid if we use the queue linearly (i.e., never reusing queues). If we want to lift this restriction, we need to take advantage of laziness.

Amortised Complexity for Persistent Queues

In order to get amortised constant time bounds even when the queue is not used linearly, we need to take advantage of lazy evaluation. We will change the front of the queue back to be a lazy list:

data Queue2 a = Q2 !Int [a] !Int !(StrictList a)

The remainder of the implementation is the same as it was for Queue1, except that reverse now needs to take a strict list as input and return a lazy list as result:

rev' :: StrictList a -> [a]
rev' = go []
  where
    go :: [a] -> StrictList a -> [a]
    go acc SNil         = acc
    go acc (SCons x xs) = go (x:acc) xs

All the other changes are just changing the operations on strict lists to operations on lazy lists:

inv2 :: Queue2 a -> Queue2 a
inv2 q@(Q2 f xs r ys)
  | f < r     = Q2 (f+r) (xs ++ rev' ys) 0 SNil
  | otherwise = q

instance Queue Queue2 where
  empty = Q2 0 [] 0 SNil
  head (Q2 _ (x:_ ) _ _ ) = x
  tail (Q2 f (_:xs) r ys) = inv2 $ Q2 (f-1) xs r ys
  snoc (Q2 f xs r ys) y   = inv2 $ Q2 f xs (r+1) (SCons y ys)

The genius of this representation lies in two facts. First, notice that when we construct the thunk (xs ++ rev' ys), we know that the rev' ys will not be forced until we have exhausted xs. Since we construct this thunk only when the rear is one longer than the front, we are indeed justified in saying that the cost of the reverse is amortised O(1).

But what about reusing the same queue twice? This is where we rely crucially on laziness. Suppose we have a sequence of operations

Q2 4 [1,2,3,4] 4 [8,7,6,5]              -- initial queue
Q2 9 ([1..4] ++ rev' [9,8,7,6,5]) 0 []  -- snoc (invariant restored)
Q2 5 (rev' [9,8,7,6,5]) 0 []            -- tail 4 times

While it is true that we might call tail on this resulting queue any number of times, those calls will not each incur the full cost of rev': since these thunks will all be shared, laziness will make sure that once this rev' has been evaluated (“forced”) once, it will not be forced again.

Of course, if we started from that initial queue and inserted various elements, then each of those would create a separate (not shared) thunk with a call to rev': but those calls to rev' will only be forced if for each of those separate queues we first do f calls to tail (in this case, 4 calls).

From Amortised to Worst-Case Bounds

The queues from the previous section will suffice for lots of applications. However, in some applications amortised complexity bounds are not good enough. For instance, in real-time systems it is not acceptable for normally-cheap operations to occasionally take a long time; each operation should take approximately the same amount of time, even if that means that the overall efficiency of the system is slightly lower.

There are two sources of delays in the implementation from the previous section. The first is that when we come across the call to reverse, that whole reverse needs to happen in one go. The second source comes from the fact that we might still chain calls to append; consider what happens when we insert the elements [1..7]:

Q2 0 []  0 []
Q2 1 r1  0 []        -- invariant restored, r1 = [] ++ rev' [1]
Q2 1 r1  1 [2]
Q2 3 r2  0 []        -- invariant restored, r2 = r1 ++ rev' [3,2]
Q2 3 r2  1 [4]
Q2 3 r2  2 [5,4]
Q2 3 r2  3 [6,5,4]
Q2 7 r3  0 []        -- invariant restored, r3 = r2 ++ rev' [7,6,5,4]

This is similar to the behaviour we saw for the queues based on a single list, except we now have a maximum of O(log n) calls rather than O(n), because the distance between two calls to reverse doubles each time.

Intuitively, we can solve both of these problems by doing a little bit of the append and a little bit of the reverse each time we call tail or snoc. We need to reestablish the invariant when r = f + 1. At this point the append will take f steps, and the reverse r steps, and we will not need to reestablish the invariant again until we have added r + f + 2 elements to the rear of the queue (or added some to the rear and removed some from the front). This therefore gives us plenty of time to do the append and the reverse, if we take one step on each call to tail and snoc.

Progress

How might we “do one step of a reverse”? This is where we diverge from Okasaki, and give a more direct implementation of this idea. We can implement a datatype that describes the “progress” of an operation:

data Progress = Done | NotYet Progress

The idea is that we can execute one step of an operation by pattern matching on an appropriate value of type Progress:

step :: Progress -> Progress
step Done       = Done
step (NotYet p) = p

For (++) it is easy to construct a Progress value which will execute the append; all we need to do is force (part of) the spine of the resulting list:

forceSpine :: Int -> [a] -> Progress
forceSpine 0 _      = Done
forceSpine _ []     = Done
forceSpine n (_:xs) = NotYet (forceSpine (n-1) xs)
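
For instance (a small sketch, with a name of our choosing), a Progress value that forces the spine of an append one cons cell at a time:

appendProgress :: Progress
appendProgress = forceSpine 4 ([1,2] ++ [3,4])

-- Pattern matching on appendProgress yields NotYet four times, each match
-- forcing one more cons cell of the appended list, and then Done.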

For other operations this is more difficult. We need some way to express a computation split into multiple steps. We can use the following datatype for this purpose:

data Delay a = Now a | Later (Delay a)

Delay a is a computation of an a, but we mark the various steps of the computation using the Later constructor (this datatype is variously known as the delay monad or the partiality monad, but we will not need the fact that it is a monad in this blog post). For example, here is reverse:

revDelay :: StrictList a -> Delay [a]
revDelay = go []
  where
    go :: [a] -> StrictList a -> Delay [a]
    go acc SNil         = Now acc
    go acc (SCons x xs) = Later $ go (x:acc) xs

We then need to be able to execute one step of such a computation. For this purpose we can introduce

runDelay :: Delay a -> (a, Progress)

which returns the final value, as well as a Progress value which allows us to execute the computation step by step. The definition of runDelay is somewhat difficult (see appendix, below), but the idea hopefully is clear: evaluating the resulting Progress for n steps will execute precisely n steps of the computation; if you look at the resulting a value before having stepped the entire Progress, the remainder of the computation will run at that point.
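
As a small usage sketch (demo is our name), consider running a delayed reverse and then stepping the resulting Progress:

demo :: ([Int], Progress)
demo = runDelay (revDelay (SCons 1 (SCons 2 (SCons 3 SNil))))

-- snd demo is NotYet (NotYet (NotYet Done)): each pattern match on a NotYet
-- executes one Later step of the reverse. After three steps, demanding
-- fst demo finds [3,2,1] already computed; demanding it earlier runs the
-- remaining steps at that point.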

Finally, we can execute two operations in lockstep by pattern matching on two Progress values at the same time:

par :: Progress -> Progress -> Progress
par !p         Done        = p
par Done       !p'         = p'
par (NotYet p) (NotYet p') = NotYet (par p p')

Real-Time Queues

We can use the Progress datatype to implement real-time queues: queues where both insertion and deletion have O(1) worst-case complexity. The representation is much like the one we used in the previous section, but we add a Progress field (Progress is an example implementation of what Okasaki calls a “schedule”):

data Queue3 a = Q3 !Int [a] !Int !(StrictList a) !Progress

Re-establishing the invariant happens much as before, except that we record the resulting Progress on the queue:

inv3 :: Queue3 a -> Queue3 a
inv3 q@(Q3 f xs r ys _)
  | f < r     = let (ys', p1) = runDelay $ revDelay ys
                    xs'       = xs ++ ys'
                    p2        = forceSpine f xs'
                in Q3 (f+r) xs' 0 SNil (par p1 p2)
  | otherwise = q

All that is left to do now is make sure we take a step of the background reverse and append actions on each call to tail and snoc:

instance Queue Queue3 where
  empty = Q3 0 [] 0 SNil Done
  head (Q3 _ (x:_ ) _ _ _)  = x
  tail (Q3 f (_:xs) r ys p) = inv3 $ Q3 (f-1) xs r ys (step p)
  snoc (Q3 f xs r ys p) y   = inv3 $ Q3 f xs (r+1) (SCons y ys) (step p)

Conclusions

It is difficult to develop data structures with amortised complexity bounds in strict but pure languages; laziness is essential for making sure that operations don’t unnecessarily get repeated. For applications where amortised bounds are insufficient, we can use an explicit schedule to make sure that operations get executed bit by bit; we can use this to develop a pure and persistent queue with O(1) insertion and deletion.

In his book, Okasaki does not introduce a Progress datatype or any of its related functionality; instead he makes very clever use of standard datatypes to get the same behaviour somehow implicitly. Although this is very elegant, it also requires a lot of ingenuity and does not immediately suggest how to apply the same techniques to other datatypes. The Progress datatype we use here is perhaps somewhat cruder, but it might make it easier to implement other real-time data structures.

Random access to (any of the variations on) the queue we implemented is still O(n); if you want a datastructure that provides O(1) insertion and deletion as well as O(log n) random access, you could have a look at Data.Sequence; be aware however that this datatype provides amortised, not real-time bounds. Modifying Sequence to provide worst-case complexity bounds is left as an exercise for the reader ;-)

Appendix: Implementation of runDelay

The definition of runDelay is tricky. The most elegant way we have found is to use the lazy ST monad:

runDelay :: Delay a -> (a, Progress)
runDelay = \xs -> runST $ do
    r <- newSTRef xs
    x <- unsafeInterleaveST $ readSTRef r
    p <- next r
    return (runNow x, p)
  where
    next :: STRef s (Delay a) -> ST s Progress
    next r = do
      xs <- readSTRef r
      case xs of
        Now _   -> return Done
        Later d -> do
          writeSTRef r d
          p' <- next r
          return $ NotYet p'

    runNow :: Delay a -> a
    runNow (Now   a) = a
    runNow (Later d) = runNow d

In the lazy ST monad effects are only executed when their results are demanded, but are always executed in the same order. We take advantage of this to make sure that the calls to next only happen when pattern matching on the resulting Progress value. However, it is crucial that for the value of x we read the contents of the STRef only when the value of x is demanded, so that we can take advantage of any writes that next will have done in the meantime.

This does leave us with a proof obligation that this code is safe; in particular, that the value of x that we return does not depend on when we execute this readSTRef; in other words, that invoking next any number of times does not change this value. However, hopefully this is relatively easy to see. Indeed, it follows from parametricity: since runDelay is polymorphic in a, the only a it can return is the one that gets passed in.

To see that pattern matching on the resulting Progress has the intended effect, note that the STRef starts out containing a computation with “cost n”, where n is the number of Later constructors, and note further that each call to next reduces n by one. Hence, by the time we reach Done, the computation has indeed been executed (reached the Now constructor).

Note that for the case of the queue implementation, by the time we demand the value of the reversed list, we are sure that we will have fully evaluated it, so the definition

runNow (Later d) = runNow d

could actually be replaced by

runNow (Later _) = error "something went horribly wrong!"

Indeed, this can be used when designing these real-time data structures, to check that things are indeed fully evaluated by the time you expect them to be. In general however it makes the runDelay combinator somewhat less general, and strictly speaking it also breaks referential transparency, because now the value of x does depend on how much of the Progress value you evaluate.

For more information about the (lazy) ST monad, see Lazy Functional State Threads, the original paper introducing it. Section 7.2, “Interleaved and parallel operations” discusses unsafeInterleaveST.

Categories: Offsite Blogs

Using Lua with Haskell

Haskell on Reddit - Fri, 01/15/2016 - 1:58am

I'm looking to write a program with some parts written in Haskell and others written in Lua.

Is the HsLua package generally well regarded, maintained, compatible, etc.?

If not -- or even if so -- does anyone have any suggestions for other ways to allow Haskell code & Lua code to work together?

P.S. In answer to the obvious question ("For goodness sake, why???"): I'm a computer science prof. This spring I'm teaching a programming languages class, and I'm preparing an assignment that requires both Haskell & Lua to be used.

submitted by ggchappell
Categories: Incoming News

Is there an intuitive interface for MySQL?

Haskell on Reddit - Thu, 01/14/2016 - 10:09pm

I'm wondering if there is a database library that interfaces with MySQL that doesn't feel like it has a lot of boilerplate and doesn't require TemplateHaskell?

I particularly like postgresql-simple's Applicative interface and ToRow/FromRow type classes, but there doesn't quite seem to be an analog for mysql-simple. The closest I've gotten was this:

data Role = Role
  { id         :: Int
  , company_id :: Int
  , name       :: String
  } deriving (Show)

instance QueryResults Role where
  convertResults [fa, fb, fc] [va, vb, vc] = Role (a :: Int) b c
    where
      a = convert fa va
      b = convert fb vb
      c = convert fc vc

connectInfo :: ConnectInfo
connectInfo = defaultConnectInfo
  { connectUser     = "root"
  , connectPassword = "test"
  , connectDatabase = "db"
  }

connection :: IO Connection
connection = connect connectInfo

testRole :: IO Role
testRole = do
  conn <- connection
  [role] <- query_ conn "select * from myjobs_role"
  return role

main :: IO ()
main = do
  role <- testRole
  print role

It still feels rather clunky and repetitive, though. Any hints at how to either cut down the above code (Maybe even generics?) or links to alternative approaches would be great.

If anyone is curious as to why I need the solution to be MySQL-based, it's because it's what we're using at work. If I can figure this little puzzle out, I'd like to eventually write a little service in Haskell for some of our data and compare the development experience to things we've done with our current stack (Django 1.6.5, Python 2.7, MySQL).

Thanks in advance.

submitted by emarshall85
Categories: Incoming News

Chris Smith: Help 25 to 30 girls learn computer programming in Mountain View, CA!

Planet Haskell - Thu, 01/14/2016 - 9:35pm

If you live around Mountain View, CA, and love kids, programming, and/or Haskell, read on!

Empoder is a non-profit organization established to teach computer science to underprivileged kids.  They are looking for volunteers to help with a coding club called Empower Girls Through Code, at Graham Middle School in Mountain View, CA.  This will be 25 to 30 girls of middle school age.  The club is led by a teacher at Graham, and we have some teaching assistants already there to help.  Empoder would like a couple more volunteers, to make sure there are enough people to give one-on-one help when it’s needed.

  • Who: A teacher, some fellow TAs, you, and 25 to 30 middle school girls excited about learning to code.
  • Where: Graham Middle School, Mountain View, CA
  • When: Wednesdays, 7:50 to 9:20 am, starting January 27
  • Why: Because it’s amazing… easily the most fun thing I have ever done.

The class will use CodeWorld, a web-based programming environment using a dialect of Haskell.  But you don’t need to know that to volunteer.  We can all learn together.

Hope to see you there!  If interested, email marissa.yanez@empoder.org.


Categories: Offsite Blogs

Comonad Reader: Adjoint Triples

Planet Haskell - Thu, 01/14/2016 - 7:16pm

A common occurrence in category theory is the adjoint triple. This is a pair of adjunctions relating three functors:

F ⊣ G ⊣ H
F ⊣ G, G ⊣ H

Perhaps part of the reason they are so common is that (co)limits form one:

colim ⊣ Δ ⊣ lim

where Δ : C -> C^J is the diagonal functor, which takes objects in C to the constant functor returning that object. A version of this shows up in Haskell (with some extensions) and dependent type theories, as:

∃ ⊣ Const ⊣ ∀
Σ ⊣ Const ⊣ Π

where, if we only care about quantifying over a single variable, existential and sigma types can be seen as a left adjoint to a diagonal functor that maps types into constant type families (either over * for the first triple in Haskell, or some other type for the second in a dependently typed language), while universal and pi types can be seen as a right adjoint to the same.
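
The left adjunction of the first triple can be rendered concretely in Haskell (a sketch using GADTs and RankNTypes; Exists, fwd and bwd are our names): the Hom-set isomorphism Hom(∃f, B) ≅ Hom(f, Const B) says that maps out of an existential are exactly polymorphic maps.

data Exists f where
  Exists :: f a -> Exists f

-- One direction: precompose with the injection into the existential.
fwd :: (Exists f -> b) -> (forall a. f a -> b)
fwd h = h . Exists

-- The other direction: unpack the existential and apply the polymorphic map.
bwd :: (forall a. f a -> b) -> (Exists f -> b)
bwd h (Exists x) = h x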

It's not uncommon to see the above information in type theory discussion forums. But, there are a few cute properties and examples of adjoint triples that I haven't really seen come up in such contexts.

To begin, we can compose the two adjunctions involved, since the common functor ensures things match up. By calculating on the hom definition, we can see:

Hom(FGA, B) ~= Hom(GA, GB) ~= Hom(A, HGB)
Hom(GFA, B) ~= Hom(FA, HB) ~= Hom(A, GHB)

So there are two ways to compose the adjunctions, giving two induced adjunctions:

FG ⊣ HG, GF ⊣ GH

And there is something special about these adjunctions. Note that FG is the comonad for the F ⊣ G adjunction, while HG is the monad for the G ⊣ H adjunction. Similarly, GF is the F ⊣ G monad, and GH is the G ⊣ H comonad. So each adjoint triple gives rise to two adjunctions between monads and comonads.

The second of these has another interesting property. We often want to consider the algebras of a monad, and coalgebras of a comonad. The (co)algebra operations with carrier A have type:

alg   : GFA -> A
coalg : A -> GHA

but these types are isomorphic according to the GF ⊣ GH adjunction. Thus, one might guess that GF monad algebras are also GH comonad coalgebras, and that in such a situation, we actually have some structure that can be characterized both ways. In fact this is true for any monad left adjoint to a comonad; [0] but all adjoint triples give rise to these.

The first adjunction actually turns out to be more familiar for the triple examples above, though. (Edit: [2]) If we consider the Σ ⊣ Const ⊣ Π adjunction, where:

Σ, Π  : (A -> Type) -> Type
Const : Type -> (A -> Type)

we get:

ΣConst : Type -> Type
ΣConst B = A × B

ΠConst : Type -> Type
ΠConst B = A -> B

So this is the familiar adjunction:

A × - ⊣ A -> -
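
In Haskell this adjunction is witnessed (up to argument order) by curry and uncurry. A minimal sketch of the two directions of Hom(A × B, C) ≅ Hom(B, A -> C), with names of our choosing:

leftAdjunct :: ((a, b) -> c) -> (b -> (a -> c))
leftAdjunct f b a = f (a, b)

rightAdjunct :: (b -> (a -> c)) -> ((a, b) -> c)
rightAdjunct g (a, b) = g b a

Equivalently, leftAdjunct = flip . curry and rightAdjunct = uncurry . flip.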

But, there happens to be a triple that is a bit more interesting for both cases. It refers back to categories of functors vs. bare type constructors mentioned in previous posts. So, suppose we have a category called Con whose objects are (partially applied) type constructors (f, g) with kind * -> *, and arrows are polymorphic functions with types like:

  forall x. f x -> g x  

And let us further imagine that there is a similar category, called Func, except its objects are the things with Functor instances. Now, there is a functor:

U : Func -> Con

that 'forgets' the functor instance requirement. This functor is in the middle of an adjoint triple:

F ⊣ U ⊣ C
F, C : Con -> Func

where F creates the free functor over a type constructor, and C creates the cofree functor over a type constructor. These can be written using the types:

data F f a = forall e. F (e -> a) (f e)

newtype C f a = C (forall r. (a -> r) -> f r)

and these types will also serve as the types involved in the composite adjunctions:

FU ⊣ CU : Func -> Func
UF ⊣ UC : Con -> Con

Now, CU is a monad on functors, and the Yoneda lemma tells us that it is actually the identity monad. Similarly, FU is a comonad, and the co-Yoneda lemma tells us that it is the identity comonad (which makes sense, because identity is self-adjoint; and the above is why F and C are often named (Co)Yoneda in Haskell examples).
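
The isomorphisms in question can be written down directly (a sketch; the names are ours). For a Functor f, both C f and F f are isomorphic to f:

toC :: Functor f => f a -> C f a
toC x = C (\k -> fmap k x)

fromC :: C f a -> f a
fromC (C e) = e id

toF :: f a -> F f a
toF = F id

fromF :: Functor f => F f a -> f a
fromF (F g x) = fmap g x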

On the other hand, UF is a monad on type constructors (note, U isn't represented in the Haskell types; F and C just play triple duty, and the constraints on f control what's going on):

eta :: f a -> F f a
eta = F id

transform :: (forall x. f x -> g x) -> F f a -> F g a
transform tr (F g x) = F g (tr x)

mu :: F (F f) a -> F f a
mu (F g (F h x)) = F (g . h) x

and UC is a comonad:

epsilon :: C f a -> f a
epsilon (C e) = e id

transform' :: (forall x. f x -> g x) -> C f a -> C g a
transform' tr (C e) = C (tr . e)

delta :: C f a -> C (C f) a
delta (C e) = C $ \h -> C $ \g -> e (g . h)

These are not the identity (co)monad, but this is the case where we have algebras and coalgebras that are equivalent. So, what are the (co)algebras? If we consider UF (and unpack the definitions somewhat):

alg :: forall e. (e -> a, f e) -> f a
alg (id, x) = x
alg (g . h, x) = alg (g, alg (h, x))

and for UC:

coalg :: f a -> forall r. (a -> r) -> f r
coalg x id = x
coalg x (g . h) = coalg (coalg x h) g

in other words, (co)algebra actions of these (co)monads are (mangled) fmap implementations, and the commutativity requirements are exactly what is required to be a law abiding instance. So the (co)algebras are exactly the Functors. [1]

There are, of course, many other examples of adjoint triples. And further, there are even adjoint quadruples, which in turn give rise to adjoint triples of (co)monads. Hopefully this has sparked some folks' interest in finding and studying more interesting examples.

[0]: Another example is A × - ⊣ A -> - where the A in question is a monoid. (Co)monad (co)algebras of these correspond to actions of the monoid on the carrier set.

[1]: This shouldn't be too surprising, because having a category of (co)algebraic structures that is equivalent to the category of (co)algebras of the (co)monad that comes from the (co)free-forgetful adjunction is the basis for doing algebra in category theory (with (co)monads, at least). However, it is somewhat unusual for a forgetful functor to have both a left and right adjoint. In many cases, something is either algebraic or coalgebraic, and not both.

[2]: Urs Schreiber informed me of an interesting interpretation of the ConstΣ ⊣ ConstΠ adjunction. If you are familiar with modal logic and the possible worlds semantics thereof, you can probably imagine that we could model it using something like P : W -> Type, where W is the type of possible worlds, and propositions are types. Then values of type Σ P demonstrate that P holds in particular worlds, while values of type Π P demonstrate that it holds in all worlds. Const turns these types back into world-indexed 'propositions,' so ConstΣ is the possibility modality and ConstΠ is the necessity modality.

Categories: Offsite Blogs

Dominic Steinitz: Inferring Parameters for ODEs using Stan

Planet Haskell - Thu, 01/14/2016 - 6:02pm
Introduction

The equation of motion for a pendulum of unit length subject to Gaussian white noise w(t) is

\ddot{\alpha}(t) = -g \sin \alpha(t) + w(t)

We can discretize this via the usual Euler method

x_{i+1} = x_i + \Delta t \begin{pmatrix} \dot{\alpha}_i \\ -g \sin \alpha_i \end{pmatrix} + q_i

where x_i = (\alpha_i, \dot{\alpha}_i)^\top and q_i \sim N(0, Q), with

Q = q^c \begin{pmatrix} \Delta t^3 / 3 & \Delta t^2 / 2 \\ \Delta t^2 / 2 & \Delta t \end{pmatrix}

The explanation of the precise form of the covariance matrix will be the subject of another blog post; for the purpose of exposition of using Stan and, in particular, Stan’s ability to handle ODEs, this detail is not important.

Instead of assuming that we know g, let us take it to be unknown and suppose that we wish to infer its value using the pendulum as our experimental apparatus.

Stan is a probabilistic programming language which should be well suited to performing such an inference. We use its interface via the R package rstan.

A Stan and R Implementation

Let’s generate some samples using Stan, but rather than generating exactly the model we have given above, let’s solve the differential equation and then add some noise. Of course this won’t quite give us samples from the model whose parameters we wish to estimate, but it will allow us to use Stan’s ODE solving capability.

Here’s the Stan

functions {
  real[] pendulum(real t, real[] y, real[] theta, real[] x_r, int[] x_i) {
    real dydt[2];
    dydt[1] <- y[2];
    dydt[2] <- - theta[1] * sin(y[1]);
    return dydt;
  }
}
data {
  int<lower=1> T;
  real y0[2];
  real t0;
  real ts[T];
  real theta[1];
  real sigma[2];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
model {
}
generated quantities {
  real y_hat[T,2];
  y_hat <- integrate_ode(pendulum, y0, t0, ts, theta, x_r, x_i);
  for (t in 1:T) {
    y_hat[t,1] <- y_hat[t,1] + normal_rng(0,sigma[1]);
    y_hat[t,2] <- y_hat[t,2] + normal_rng(0,sigma[2]);
  }
}

And here’s the R to invoke it

library(rstan)
library(mvtnorm)

qc1 = 0.0001
deltaT = 0.01
nSamples = 100
m0 = c(1.6, 0)
g = 9.81
t0 = 0.0
ts = seq(deltaT, nSamples * deltaT, deltaT)

bigQ = matrix(c(qc1 * deltaT^3 / 3, qc1 * deltaT^2 / 2,
                qc1 * deltaT^2 / 2, qc1 * deltaT),
              nrow = 2, ncol = 2, byrow = TRUE)

samples <- stan(file = 'Pendulum.stan',
                data = list(T = nSamples,
                            y0 = m0,
                            t0 = t0,
                            ts = ts,
                            theta = array(g, dim = 1),
                            sigma = c(bigQ[1,1], bigQ[2,2]),
                            refresh = -1),
                algorithm = "Fixed_param",
                seed = 42,
                chains = 1,
                iter = 1)

We can plot the angle the pendulum subtends to the vertical over time. Note that this is not very noisy.

s <- extract(samples, permuted = FALSE)
plot(s[1,1,1:100])

Now let us suppose that we don’t know the value of g and that we can only observe the horizontal displacement.

zStan <- sin(s[1,1,1:nSamples])

Now we can use Stan to infer the value of g.

functions {
  real[] pendulum(real t, real[] y, real[] theta, real[] x_r, int[] x_i) {
    real dydt[2];
    dydt[1] <- y[2];
    dydt[2] <- - theta[1] * sin(y[1]);
    return dydt;
  }
}
data {
  int<lower=1> T;
  real y0[2];
  real z[T];
  real t0;
  real ts[T];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  real theta[1];
  vector<lower=0>[1] sigma;
}
model {
  real y_hat[T,2];
  real z_hat[T];
  theta ~ normal(0,1);
  sigma ~ cauchy(0,2.5);
  y_hat <- integrate_ode(pendulum, y0, t0, ts, theta, x_r, x_i);
  for (t in 1:T) {
    z_hat[t] <- sin(y_hat[t,1]);
    z[t] ~ normal(z_hat[t], sigma);
  }
}

Here’s the R to invoke it.

estimates <- stan(file = 'PendulumInfer.stan',
                  data = list(T = nSamples, y0 = m0, z = zStan, t0 = t0, ts = ts),
                  seed = 42,
                  chains = 1,
                  iter = 1000,
                  warmup = 500,
                  refresh = -1)
e <- extract(estimates, pars = c("theta[1]","sigma[1]","lp__"), permuted = TRUE)

This gives an estimated value for g of 9.809999, which is what we would hope.

Now let’s try adding some noise to our observations.

set.seed(42)
epsilons <- rmvnorm(n = nSamples, mean = c(0.0), sigma = bigR)
zStanNoisy <- sin(s[1,1,1:nSamples] + epsilons[,1])

estimatesNoisy <- stan(file = 'PendulumInfer.stan',
                       data = list(T = nSamples, y0 = m0, z = zStanNoisy, t0 = t0, ts = ts),
                       seed = 42,
                       chains = 1,
                       iter = 1000,
                       warmup = 500,
                       refresh = -1)
eNoisy <- extract(estimatesNoisy, pars = c("theta[1]","sigma[1]","lp__"), permuted = TRUE)

This gives an estimated value for g of 8.5871024, which is OK but not great.

Postamble

To build this page, download the relevant files from github and run this in R:

library(knitr)
knit('Pendulum.Rmd')

And this from the command line:

pandoc -s Pendulum.md --filter=./Include > PendulumExpanded.html
Categories: Offsite Blogs

Magnus Therning: Free play, part two

Planet Haskell - Thu, 01/14/2016 - 6:00pm

This post on Free play builds on my previous one. It was John A De Goes’ post that finally got me started on playing with Free, and his post contains the following:

Moreover, not only do algebras compose (that is, if you have algebras f and g, you can compose them into a composite algebra Coproduct f g for some suitable definition of Coproduct), but interpreters also compose — both horizontally and vertically.

And a little bit later he offers a few types and some details, but not all details. How could something like that look in Haskell?

Starting from the code in the previous post I first created a new type for a logging action

data LogF a = Log String a

This type has to be a Functor

instance Functor LogF where
  fmap f (Log s a) = Log s (f a)

The logging should basically decorate a SimpleFileF a action, so I need a function to map one into a Free LogF a

logSimpleFileI :: SimpleFileF a -> Free LogF ()
logSimpleFileI (LoadFile fp _)   = liftF $ Log ("** load file " ++ fp) ()
logSimpleFileI (SaveFile fp _ _) = liftF $ Log ("** save file " ++ fp) ()

Now I needed a Coproduct for Functor. Searching Hackage only offered up one for Monoid (in monoid-extras), so I first translated one from PureScript, but later I got some help via Twitter and was pointed to two in Haskell: Data.Functor.Coproduct from comonad and Data.Functor.Sum from transformers. I decided on the one from transformers because of its shorter name and the fact that it was very different from my translated-from-PureScript version.
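
For reference, Data.Functor.Sum defines (up to naming) exactly the functor coproduct:

data Sum f g a = InL (f a) | InR (g a)

instance (Functor f, Functor g) => Functor (Sum f g) where
  fmap h (InL x) = InL (fmap h x)
  fmap h (InR y) = InR (fmap h y)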

Following John’s example I use Applicative to combine the logging with the file action

loggingSimpleFileI :: SimpleFileF a -> Free (Sum LogF SimpleFileF) a
loggingSimpleFileI op = toLeft (logSimpleFileI op) *> toRight (liftF op)

with toLeft and toRight defined like this

toLeft :: (Functor f, Functor g) => Free f a -> Free (Sum f g) a
toLeft = hoistFree InL

toRight :: (Functor f, Functor g) => Free g a -> Free (Sum f g) a
toRight = hoistFree InR

With all of this in place I can decorate the program from the last post like this: foldFree loggingSimpleFileI (withSimpleF toUpper "FreePlay.hs"). What’s left is a way to run it. The function for that is a natural extension of runSimpleFile:

runLogging :: Free (Sum LogF SimpleFileF) a -> IO a
runLogging = foldFree f
  where
    f :: (Sum LogF SimpleFileF) a -> IO a
    f (InL op) = g op
    f (InR op) = h op

    g :: LogF a -> IO a
    g (Log s a) = putStrLn s >> return a

    h :: SimpleFileF a -> IO a
    h (LoadFile fp f')  = liftM f' $ readFile fp
    h (SaveFile fp d r) = writeFile fp d >> return r

Running the decorated program

runLogging $ foldFree loggingSimpleFileI (withSimpleF toUpper "FreePlay.hs")

does indeed result in the expected output

** load file FreePlay.hs
** save file FreePlay.hs_new

and the file FreePlay.hs_new contains only uppercase letters.

My thoughts

This ability to decorate actions (or compose algebras) is very nice. There’s probably value in the “multiple interpreters for a program” idea in some domains, but I have a feeling that it could be a hard sell. However, combining it with this kind of composability adds quite a bit of value, in my opinion. I must say I don’t think my code scales very well to adding more decorators (composing more algebras), but hopefully some type wizard can show me a way to improve on that.

The code above is rather crude though, and I have another version that cleans it up quite a bit. That’ll be in the next post.

Categories: Offsite Blogs

Neil Mitchell: A simple Haskell function

Planet Haskell - Thu, 01/14/2016 - 5:03pm

Summary: An example of a small function I recently wrote - from type signature to tests.

When writing a build system there are lots of nasty corner cases to consider. One is that command-line length limits, combined with lots of arguments, sometimes require splitting a single command up into multiple commands, each of which is under some maximum length. In this post I'll describe a function that was required, my implementation, and how I tested it.

Type signature and documentation

Before I even got to the function, it already had a type signature and some Haddock documentation:

-- | @chunksOfSize size strings@ splits a given list of strings into chunks not
-- exceeding @size@ characters. If that is impossible, it uses singleton chunks.
chunksOfSize :: Int -> [String] -> [[String]]

As an example:

chunksOfSize 5 ["this","is","a","test"] == [["this"],["is","a"],["test"]]
Implementation

My implementation was:

chunksOfSize n = repeatedly $ \xs ->
    let i = length $ takeWhile (<= n) $ scanl1 (+) $ map length xs
    in splitAt (max 1 i) xs

First we use the repeatedly function from the extra library. This has the signature:

repeatedly :: ([a] -> (b, [a])) -> [a] -> [b]

Given a list of input, you supply a function that splits off an initial piece and returns the rest. One of the examples in the documentation is:

repeatedly (splitAt 3) xs == chunksOf 3 xs

So we can see how repeatedly lets us focus on just the "next step" of this list, ignoring the recursion. For the function argument we have two tasks - first decide how many items to put in this chunk, then to split the chunks. Splitting the chunks is the easy bit, and can be written:

splitAt (max 1 i) xs

If we know the next i elements will be at or below the limit, then we can use splitAt to divide the elements. As a special case, if no elements would be allowed, we allow one, using max 1 to ensure we never pass 0 to splitAt (and thus enter an infinite loop). That leaves us with:

i = length $ takeWhile (<= n) $ scanl1 (+) $ map length xs

Reading from right to left, we reduce each element to its length, then use scanl1 to produce a running total - so each element represents the total length up to that point. We then use takeWhile (<= n) to keep grabbing elements while the running total stays within the limit, and finally length to convert back to something we can use with splitAt.
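
Tracing the earlier example, chunksOfSize 5 ["this","is","a","test"]:

-- map length        -> [4,2,1,4]
-- scanl1 (+)        -> [4,6,7,11]
-- takeWhile (<= 5)  -> [4], so i = 1 and splitAt 1 yields chunk ["this"]

Recursing on ["is","a","test"] gives running sums [2,3,7], so i = 2 and the next chunk is ["is","a"]; the final step yields ["test"], matching the example above.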

Tests

When testing, I tend to start with a few concrete examples then move on to QuickCheck properties. As an initial example we can do:

quickCheck $
    chunksOfSize 3 ["a","b","c","defg","hi","jk"] ==
    [["a","b","c"],["defg"],["hi"],["jk"]]

Here we are explicitly testing some of the corner cases - we want to make sure the full complement of 3 get into the first chunk (and we haven't got an off-by-one), we also test a singleton chunk of size 4. Now we move on to QuickCheck properties:

quickCheck $ \n xs ->
    let res = chunksOfSize n xs
    in concat res == xs &&
       all (\r -> length r == 1 || length (concat r) <= n) res

There are really two properties here - first, the chunks concat together to form the original. Secondly, each chunk is either under the limit or a singleton. These properties capture the requirements in the documentation.

A final property we can check is that it should never be possible to move the first piece from a chunk to the previous chunk. We can write such a property as:

all (> n) $ zipWith (+)
    (map (sum . map length) res)
    (drop 1 $ map (length . head) res)

This property isn't as important as the other invariants, and is somewhat tested in the example, so I didn't include it in the test suite.

Performance and alternatives

The complexity is O(n) in the number of Char values, which is as expected, since we have to count them all. Some observations about this point in the design space:

  • In a strict language this would be an O(n^2) implementation, since we would repeatedly length and scanl the remainder of the tail each time. As it is, we are calling length on the first element of each chunk twice, so there is minor constant overhead.
  • Usually in Haskell, instead of counting the number of elements and then doing splitAt we would prefer to use span - something like span ((<= n) . fst) .... While possible, it makes the special singleton case more difficult, and requires lots of tuples/contortions to associate each element with its rolling sum.
  • For a build system, the entire input will be evaluated before, and the entire output will be kept in memory afterwards. However, if we think about this program with lazy streaming inputs and outputs, it will buffer each element of the output list separately. As a result memory would be bounded by the maximum of the longest string and the Int argument to chunksOfSize.
  • It is possible to write a streaming version of this function, which returns each String as soon as it is consumed, with memory bounded by the longest string alone. Moreover, if the solution above was to use lazy naturals, it would actually come quite close to being streaming (albeit gaining a quadratic complexity term from the takeWhile (<= n)).
  • The type signature could be generalised to [a] instead of String, but I would suspect in this context it's more likely for String to be replaced by Text or ByteString, rather than to be used on [Bool]. As a result, sticking to String seems best.
Refactoring the previous version

The function already existed in the codebase I was working on, so below is the original implementation. This implementation does not handle the long singleton special case (it loops forever). We can refactor it to support the singleton case, which we do in several steps. The original version was:

chunksOfSize _ [] = []
chunksOfSize size strings = reverse chunk : chunksOfSize size rest
  where
    (chunk, rest) = go [] 0 strings

    go res _ [] = (res, [])
    go res chunkSize (s:ss) =
        if newSize > size then (res, s:ss) else go (s:res) newSize ss
      where
        newSize = chunkSize + length s

Refactoring to use repeatedly we get:

chunksOfSize size = repeatedly $ second reverse . go [] 0
  where
    go res _ [] = (res, [])
    go res chunkSize (s:ss) =
        if newSize > size then (res, s:ss) else go (s:res) newSize ss
      where
        newSize = chunkSize + length s

Changing go to avoid the accumulator we get:

chunksOfSize size = repeatedly $ go 0
  where
    go _ [] = ([], [])
    go chunkSize (s:ss) =
        if newSize > size then ([], s:ss) else first (s:) $ go newSize ss
      where
        newSize = chunkSize + length s

It is then reasonably easy to fix the singleton bug:

chunksOfSize size = repeatedly $ \(x:xs) -> first (x:) $ go (length x) xs
  where
    go _ [] = ([], [])
    go chunkSize (s:ss) =
        if newSize > size then ([], s:ss) else first (s:) $ go newSize ss
      where
        newSize = chunkSize + length s

Finally, it is slightly simpler to keep track of the number of characters still allowed, rather than the number of characters already produced:

chunksOfSize size = repeatedly $ \(x:xs) -> first (x:) $ go (size - length x) xs
  where
    go n (x:xs) | let n2 = n - length x, n2 >= 0 = first (x:) $ go n2 xs
    go n xs = ([], xs)

Now we have an alternative version that is maximally streaming, only applies length to each element once, and would work nicely in a strict language. I find the version at the top of this post more readable, but this version is a reasonable alternative.

Acknowledgements: Thanks to Andrey Mokhov for providing the repo, figuring out all the weird corner cases with ar, and distilling it down into a Haskell problem.

Categories: Offsite Blogs

Call for Participation: BOB 2016 (February 19, Berlin)

haskell-cafe - Thu, 01/14/2016 - 11:15am
Quick reminder: the early registration deadline for BOB 2016 is this Sunday! We have some interesting Haskell talks and tutorials at BOB 2016!

================================================================
BOB 2016 Conference
"What happens if we simply use what's best?"
February 19, 2016, Berlin
http://bobkonf.de/2016/
Program: http://bobkonf.de/2016/program.html
Registration: http://bobkonf.de/2016/registration.html
================================================================

BOB is the conference for developers, architects and decision-makers to explore technologies beyond the mainstream in software development, and to find the best tools available to software developers today. Our goal is for all participants of BOB to return home with new insights that
Categories: Offsite Discussion


Hiring: Haskell at Biotech Startup!

haskell-cafe - Thu, 01/14/2016 - 12:37am
Hello haskell-cafe,

I'm an engineer at Karius, a "stealth-mode" biotech startup in Menlo Park, CA, and we're looking to hire a few folks to write software (and we use Haskell!). Currently only hiring locally or with relocation (though that could change in the future, so feel free to get in touch regardless!).

We are a team of crazy biologists, engineers, data scientists and clinicians on a mission to change forever the way infectious diseases are diagnosed and treated. We face incredibly interesting challenges in software engineering, machine learning and molecular biology, as we push the limits of diagnostics and genomic technologies.

We're hiring computational biologists, software engineers and data scientists. If you're a software engineer, we're looking for experience in front-end, back-end, web development, infrastructure, devops, bioinformatics, and machine learning. We have a varied list of challenges; we build large data processing pipelines to analyze data from in-house DNA sequencers, separ
Categories: Offsite Discussion

Proposal: Add Semigroup and Monoid instances for (:~:)

libraries list - Wed, 01/13/2016 - 7:58am
These are not likely to be *useful*, but they fall easily into the general bucket of "exactly one sensible instance exists".

instance Semigroup (a :~: b) where
  r <> _ = r

instance a ~ b => Monoid (a :~: b) where
  mempty = Refl
  mappend r _ = r
Categories: Offsite Discussion

GHC.Generics and instances of Functor, Applicative, etc?

libraries list - Wed, 01/13/2016 - 3:28am
I'm using GHC.Generics and have noticed that the data types defined there don't have instances of the standard classes (at least not defined in that module), such as Functor, Applicative, Foldable and Traversable. It'd be very useful to be able to rely on such instances, so that one can define instances of custom types via instances of the generic types. (Hopefully, the compiler can remove most or all conversion overhead) Is there any reason not to add these generic instances?
Categories: Offsite Discussion

Proposal: Add Peano numbers to base

libraries list - Mon, 01/11/2016 - 7:29pm
I have seen these redefined many times now. Examples:

https://hackage.haskell.org/package/numericpeano-0.2.0.0/docs/Numeric-Peano.html#t:Nat
https://hackage.haskell.org/package/numeric-prelude-0.4.2/docs/Number-Peano.html#t:T
https://hackage.haskell.org/package/type-fun-0.0.1/docs/TypeFun-Data-Peano.html#t:N
https://hackage.haskell.org/package/number-0.1.1.0/docs/Data-Number-Peano.html#t:Nat
https://hackage.haskell.org/package/Peano-0.0.4/docs/Data-Peano.html#t:Peano

I often see them used as DataKinds. Too, operations on them can be lazy, which is sometimes useful.

I filed a ticket: https://ghc.haskell.org/trac/ghc/ticket/11402

Discussion period: 2 weeks
Categories: Offsite Discussion

Eq1, Ord1, Show1: move from eq1, compare1, showsPrec1 to liftEq, liftCompare, liftShowsPrec

libraries list - Sun, 01/10/2016 - 3:08pm
In transformers-0.5:Data.Functor.Classes class methods like eq1 are replaced by liftEq. With transformers-0.4 I could define:

data T a = complicated definition ... deriving (Eq, Ord, Show)

instance Eq1 T where eq1 = (==)
instance Ord1 T where compare1 = compare
instance Show1 T where showsPrec1 = showsPrec

In transformers-0.5 it seems that I have to implement Eq1, Ord1, Show1 instances manually. Is there some assistance to define eq1 and compare1 for an ADT? What are the use cases where eq1 was not powerful enough and liftEq is needed?
Categories: Offsite Discussion

New gtk2hs 0.12.4 release

gtk2hs - Wed, 11/21/2012 - 12:56pm

Thanks to John Lato and Duncan Coutts for the latest bugfix release! The latest packages should be buildable on GHC 7.6, and the cairo package should behave a bit nicer in ghci on Windows. Thanks to all!

~d

Categories: Incoming News