News aggregator

How do I get regex-tdfa to work with ByteStrings?

Haskell on Reddit - Tue, 12/30/2014 - 6:27pm

I was trying to get ByteStrings to play nice with =~ (whose type I've written below all this text), but the types didn't match, so I looked at the type signature. A portion of it was RegexContext (RC).

In Text.Regex.Base.RegexLike (re-exported by Text.Regex.TDFA) I can clearly see that RC is dependent on RegexLike (RL):

class RegexLike regex source => RegexContext regex source target where

And RL, in turn, is dependent on

class Extract source => RegexLike regex source where

And that — Extract — does indeed have a ByteString instance. So then RegexContext Regex ByteString ByteString should exist, but it doesn't. Behold:

λ: import Text.Regex.TDFA
λ: import Data.ByteString
λ: :set -XOverloadedStrings
λ: :set -XFlexibleContexts
λ: :t (=~)
(=~)
  :: (RegexContext Regex source1 target,
      RegexMaker Regex CompOption ExecOption source) =>
     source1 -> source -> target
λ: let f = undefined :: RegexContext Regex a a => a -> String
λ: let s = "foo" :: ByteString
Loading package array-0.5.0.0 ... linking ... done.
Loading package deepseq-1.3.0.2 ... linking ... done.
Loading package bytestring-0.10.4.1 ... linking ... done.
λ: :t f s

<interactive>:1:1:
    No instance for (RegexContext Regex ByteString ByteString)
      arising from a use of ‘f’
    In the expression: f s

Why's that?

Also: is there a better way of asking Haskell this query? Maybe there's a way without adding FlexibleContexts?

submitted by anarchistic
[link] [comment]
Categories: Incoming News

Roman Cheplyaka: Denotational design does not work

Planet Haskell - Tue, 12/30/2014 - 4:00pm

Conal Elliott in his paper Denotational design with type class morphisms, as well as in the recent podcast, advocates denotational design as a principle of software design. Unfortunately, in my experience it never works for realistically complex problems.

On simplicity

First, I’ll formulate Conal’s approach as I understand it. For any given entity of your model, you should come up with a simple mathematical object — the denotation — that faithfully represents that entity.

The implementation of the type may vary, presumably to maximize its runtime efficiency, but it should not expose any more information than the chosen denotation has. That is considered an «abstraction leak». Conal specifically talks about that in the podcast (31m50s, for example).

Here I need to stress an important but subtle point in Conal’s principle: simplicity. You only follow Conal’s principle if you find a simple denotation, not just any denotation.

This point is important because without it any design is denotational design, trivially. Universal algebra tells us that for any set of operations and any (non-contradictory) laws about them there exists a model that satisfies these laws. For any Haskell module, we can interpret its types in the set theory (or a more complex domain if needed), and call that our denotation.

But that’s not what Conal is after. His approach is interesting exactly because he argues that it is possible to find simple denotations. This subtle point makes Conal’s approach simultaneously attractive and unrealistic. I’ll demonstrate this with two examples from my own work experience.

DSLs

At Barclays I worked on FPF, an embedded domain-specific language for describing financial instruments. In his paper, Conal shows how a denotation for such a DSL can quickly grow in complexity when requirements change. When variables and errors are introduced, the denotation changes from a to Env -> Result a. Still, this is a very simple DSL that only supports evaluation.

In reality, the main reason people make DSLs instead of using general-purpose languages is the ability to analyze DSL programs. One important feature of FPF is that it could pretty-print a program into a nice PDF. That poses an obvious problem — not every two semantically equivalent programs (under the interpretation semantics) result in equally nice PDFs. Inlining is a semantically sound transformation, but when our users get PDFs with all the definitions inlined, they get angry.

Sure, we could say that now our denotation becomes the domain product (Env -> Result a, String), where String is the pretty printer output. But in reality we have a dozen different analyses, and most of them are not expressible in terms of each other, or any single simple model. They also do not satisfy many laws. For instance, one day a user (quant or trader) could come and tell us that the barrier classifier should classify two mathematically equivalent expressions as different barriers because those expressions follow certain conventions. And even though the quant is mathematically inclined, denotations and type class morphisms would be the last thing he wants to hear about in response to his feature request.

So, in practice, the best denotation for the DSL expressions was the AST itself. Which, according to my interpretation of Conal’s principles, is not an example of a denotational design, but a failure to apply one.
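To make the point concrete, here is a toy sketch (mine, far simpler than FPF) of a DSL with two interpretations that cannot be collapsed into one simple denotation: inlining preserves the evaluation semantics but changes the pretty-printed output the user sees.

```haskell
-- A toy DSL, not FPF itself. Two interpretations of the same AST:
-- evaluation and pretty-printing.
data Expr
  = Lit Int
  | Add Expr Expr
  | Let String Expr Expr   -- a user-visible definition
  | Var String
  deriving Show

eval :: [(String, Int)] -> Expr -> Int
eval _   (Lit n)     = n
eval env (Add a b)   = eval env a + eval env b
eval env (Let x e b) = eval ((x, eval env e) : env) b
eval env (Var x)     = maybe (error "unbound variable") id (lookup x env)

pretty :: Expr -> String
pretty (Lit n)     = show n
pretty (Add a b)   = pretty a ++ " + " ++ pretty b
pretty (Let x e b) = "let " ++ x ++ " = " ++ pretty e ++ " in " ++ pretty b
pretty (Var x)     = x

-- Inlining is sound under eval, but changes what the user sees:
original, inlined :: Expr
original = Let "x" (Add (Lit 1) (Lit 2)) (Add (Var "x") (Var "x"))
inlined  = Add (Add (Lit 1) (Lit 2)) (Add (Lit 1) (Lit 2))
```

Here `eval [] original` and `eval [] inlined` agree, while `pretty` produces different documents, so neither interpretation's denotation subsumes the other.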

Machines

At my current job (Signal Vine), I work on a platform for scripted interaction with students via text messages. For every student enrolled in a messaging campaign, we send a message, receive a reply, process it, and the cycle repeats.

This is very similar to FRP; perhaps not the FRP Conal prefers (in the podcast he stresses the importance of continuous functions as opposed to events), but the kind of discrete FRP that Justin Le models with Mealy machines.

So it would seem that I should model a student as

newtype Student = Student (InboundSMS -> (OutboundSMS, Student))

That would be an exemplary case of denotational design. But that would be far from our customers’ needs. Every student has a set of profile variables that are filled when the student responds to a text, and our customers (counselors who work with that student) want to see those variables. They also want to see which messages were sent, what the student’s replies were, and even what messages will be sent to the student in the future. These requirements defeat the attempt to model a student in a simple, abstract way. Instead, I need to store all the information I have about the student because sooner or later I’ll need to expose that information to the user.
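Concretely, the representation ends up looking less like a Mealy machine and more like a plain record of everything we know. A minimal sketch (field and function names here are illustrative, not Signal Vine's actual schema):

```haskell
-- An illustrative "store everything" representation; not real production code.
data Direction = Inbound | Outbound deriving (Eq, Show)

data Message = Message
  { direction :: Direction
  , body      :: String
  } deriving Show

data Student = Student
  { profileVars :: [(String, String)]  -- variables filled in from replies
  , history     :: [Message]           -- messages already exchanged
  , scheduled   :: [Message]           -- messages still to be sent
  } deriving Show

-- Processing a reply both advances the interaction and records information,
-- so counselors can later inspect the full state.
receive :: String -> Student -> Student
receive reply s = s { history = history s ++ [Message Inbound reply] }
```

The abstract function type hides exactly the fields (profileVars, history, scheduled) that the customers ask to see.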

Conclusions

Denotational design is a very neat idea, but I believe that it only works in simple cases and when requirements are static. In real-world commercial programming, it breaks for two main reasons:

  1. Users often want maximum insight into what’s going on, and you need to break the abstraction to deliver that information.
  2. Requirements change, and an innocent change in requirements may lead to a drastic change and increase in complexity of the denotation.

It is certainly useful to think about denotations of your entities in specific, simple contexts (like the evaluation semantics for a DSL); such thought experiments may help you better understand the problem or even find a flaw in your implementation.

But when you are implementing the actual type, your best bet is to create an algebraic data type with all the information you have, so that when you need to extend the interface (or «leak the abstraction»), it won’t cause you too much pain.

Categories: Offsite Blogs

Edward Kmett: Fast Circular Substitution

Planet Haskell - Tue, 12/30/2014 - 2:43pm

Emil Axelsson and Koen Claessen wrote a functional pearl last year about Using Circular Programs for Higher-Order Syntax.

About 6 months ago I had an opportunity to play with this approach in earnest, and realized we can speed it up a great deal. This has kept coming up in conversation ever since, so I've decided to write up an article here.

In my bound library I exploit the fact that monads are about substitution to make a monad transformer that manages substitution for me.

Here I'm going to take a more coupled approach.

To have a type system with enough complexity to be worth examining, I'll adapt Dan Doel's UPTS, which is a pure type system with universe polymorphism. I won't finish the implementation here, but from where we stop it should be obvious how to finish the job.

Unlike Axelsson and Claessen I'm not going to bother to abstract over my name representation.

To avoid losing the original name from the source, we'll just track names as strings with an integer counting the number of times it has been 'primed'. The name is purely for expository purposes; the real variable identifier is the number. We'll follow the Axelsson and Claessen convention of having the identifier assigned to each binder be larger than any identifier bound inside of it. If you don't need the original source names you can cull them from the representation, but they can be useful if you are representing a syntax tree for something you parsed and/or that you plan to pretty print later.

  data Name = Name String Int deriving (Show,Read)

  hint :: Name -> String
  hint (Name n _) = n

  nameId :: Name -> Int
  nameId (Name _ i) = i

  instance Eq Name where
    (==) = (==) `on` nameId

  instance Ord Name where
    compare = compare `on` nameId

  prime :: String -> Int -> Name
  prime n i = Name n (i + 1)

So what is the language I want to work with?

  type Level = Int

  data Constant
    = Level
    | LevelLiteral {-# UNPACK #-} !Level
    | Omega
    deriving (Eq,Ord,Show,Read,Typeable)

  data Term a
    = Free a
    | Bound {-# UNPACK #-} !Name
    | Constant !Constant
    | Term a :+ {-# UNPACK #-} !Level
    | Max [Term a]
    | Type !(Term a)
    | Lam   {-# UNPACK #-} !Name !(Term a) !(Term a)
    | Pi    {-# UNPACK #-} !Name !(Term a) !(Term a)
    | Sigma {-# UNPACK #-} !Name !(Term a) !(Term a)
    | App !(Term a) !(Term a)
    | Fst !(Term a)
    | Snd !(Term a)
    | Pair !(Term a) !(Term a) !(Term a)
    deriving (Show,Read,Eq,Ord,Functor,Foldable,Traversable,Typeable)

That is perhaps a bit paranoid about remaining strict, but it seemed like a good idea at the time.

We can define capture avoiding substitution on terms:

  subst :: Eq a => a -> Term a -> Term a -> Term a
  subst a x y = y >>= \a' -> if a == a' then x else return a'

Now we finally need to implement Axelsson and Claessen's circular programming trick. Here we'll abstract over terms that allow us to find the highest bound value within them:

  class Bindable t where
    bound :: t -> Int

and instantiate it for our Term type

  instance Bindable (Term a) where
    bound Free{}        = 0
    bound Bound{}       = 0 -- intentional!
    bound Constant{}    = 0
    bound (a :+ _)      = bound a
    bound (Max xs)      = foldr (\a r -> bound a `max` r) 0 xs
    bound (Type t)      = bound t
    bound (Lam b t _)   = nameId b `max` bound t
    bound (Pi b t _)    = nameId b `max` bound t
    bound (Sigma b t _) = nameId b `max` bound t
    bound (App x y)     = bound x `max` bound y
    bound (Fst t)       = bound t
    bound (Snd t)       = bound t
    bound (Pair t x y)  = bound t `max` bound x `max` bound y

As in the original pearl we avoid traversing into the body of the binders, hence the _'s in the code above.

Now we can abstract over the pattern used to create a binder in the functional pearl, since we have multiple binder types in this syntax tree, and the code would get repetitive.

  binder :: Bindable t => (Name -> t) -> (Name -> t -> r) -> String -> (t -> t) -> r
  binder bd c n e = c b body where
    body = e (bd b)
    b    = prime n (bound body)

  lam, pi, sigma :: String -> Term a -> (Term a -> Term a) -> Term a
  lam   s t = binder Bound (`Lam` t) s
  pi    s t = binder Bound (`Pi` t) s
  sigma s t = binder Bound (`Sigma` t) s
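As an aside, the knot-tying in binder can be studied in isolation. Here is a self-contained miniature of the same circular trick on an untyped lambda calculus (the names here are mine, separate from the Circular.hs code):

```haskell
-- A stripped-down version of the Axelsson/Claessen circular binder trick.
data Exp
  = V Int        -- a bound variable, identified by number
  | A Exp Exp    -- application
  | L Int Exp    -- a binder carrying its own identifier
  deriving (Eq, Show)

-- Largest binder identifier in a term; like `bound`, we deliberately
-- do not traverse under a binder.
maxBV :: Exp -> Int
maxBV (V _)   = 0
maxBV (A f a) = maxBV f `max` maxBV a
maxBV (L n _) = n

-- The circular part: the binder's identifier depends on the body, and the
-- body is built from a variable carrying that identifier. Laziness ties
-- the knot, because maxBV never forces the identifier inside V.
lam :: (Exp -> Exp) -> Exp
lam f = L n body
  where
    body = f (V n)
    n    = maxBV body + 1

-- HOAS-style construction of the K combinator, λx. λy. x
konst :: Exp
konst = lam $ \x -> lam $ \_ -> x
```

Evaluating `konst` yields `L 2 (L 1 (V 2))`: the outer binder's identifier is larger than any identifier bound inside it, matching the convention above.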

We may not always want to give names to the variables we capture, so let's define:

  lam_, pi_, sigma_ :: Term a -> (Term a -> Term a) -> Term a
  lam_   = lam "_"
  pi_    = pi "_"
  sigma_ = sigma "_"

Now, here's the interesting part. The problem with Axelsson and Claessen's original trick is that every substitution is being handled separately. This means that if you were to write a monad for doing substitution with it, it'd actually be quite slow. You have to walk the syntax tree over and over and over.

We can fuse these together by making a single pass:

  instantiate :: Name -> t -> IntMap t -> IntMap t
  instantiate = IntMap.insert . nameId

  rebind :: IntMap (Term b) -> Term a -> (a -> Term b) -> Term b
  rebind env xs0 f = go xs0 where
    go = \case
      Free a      -> f a
      Bound b     -> env IntMap.! nameId b
      Constant c  -> Constant c
      m :+ n      -> go m :+ n
      Type t      -> Type (go t)
      Max xs      -> Max (fmap go xs)
      Lam b t e   -> lam   (hint b) (go t) $ \v -> rebind (instantiate b v env) e f
      Pi b t e    -> pi    (hint b) (go t) $ \v -> rebind (instantiate b v env) e f
      Sigma b t e -> sigma (hint b) (go t) $ \v -> rebind (instantiate b v env) e f
      App x y     -> App (go x) (go y)
      Fst x       -> Fst (go x)
      Snd x       -> Snd (go x)
      Pair t x y  -> Pair (go t) (go x) (go y)

Note that the Lam, Pi and Sigma cases just extend the current environment.

With that now we can upgrade the pearl's encoding to allow for an actual Monad in the same sense as bound.

  instance Applicative Term where
    pure  = Free
    (<*>) = ap

  instance Monad Term where
    return = Free
    (>>=)  = rebind IntMap.empty

To show that we can work with this syntax tree representation, let's write an evaluator from it to weak head normal form:

First we'll need some helpers:

  apply :: Term a -> [Term a] -> Term a
  apply = foldl App

  rwhnf :: IntMap (Term a) -> [Term a] -> Term a -> Term a
  rwhnf env stk     (App f x)   = rwhnf env (rebind env x Free : stk) f
  rwhnf env (x:stk) (Lam b _ e) = rwhnf (instantiate b x env) stk e
  rwhnf env stk     (Fst e)     = case rwhnf env [] e of
    Pair _ e' _ -> rwhnf env stk e'
    e'          -> Fst e'
  rwhnf env stk     (Snd e)     = case rwhnf env [] e of
    Pair _ _ e' -> rwhnf env stk e'
    e'          -> Snd e'
  rwhnf env stk e = apply (rebind env e Free) stk

Then we can start off the whnf by calling our helper with an initial starting environment:

  whnf :: Term a -> Term a
  whnf = rwhnf IntMap.empty []

So what have we given up? Well, bound automatically lets you compare terms for alpha equivalence by quotienting out the placement of "F" terms in the syntax tree. Here we have a problem in that the identifiers we get assigned aren't necessarily canonical.

But we can get the same identifiers out by just using the monad above:

  alphaEq :: Eq a => Term a -> Term a -> Bool
  alphaEq = (==) `on` liftM id

It makes me a bit uncomfortable that our monad is only up to alpha equivalence and that liftM swaps out the identifiers used throughout the entire syntax tree, and we've also lost the ironclad protection against exotic terms.

But overall, this is a much faster version of Axelsson and Claessen's trick and it can be used as a drop-in replacement for something like bound in many cases, and unlike bound, it lets you use HOAS-style syntax for constructing lam, pi and sigma terms.

With pattern synonyms you can prevent the user from doing bad things as well. Once 7.10 ships you'd be able to use a bidirectional pattern synonym for Pi, Sigma and Lam to hide the real constructors behind. I'm not yet sure of the "best practices" in this area.

Here's the code all in one place:

[Download Circular.hs]

Happy Holidays,
-Edward

Categories: Offsite Blogs

binary only distribution packages

Haskell on Reddit - Tue, 12/30/2014 - 1:13pm

Is there a way to create a binary only cabal package without source code included?

Or if it is not possible is there a tutorial or manual on how to create linux distribution binary packages for haskell libraries?

submitted by vagif
[link] [4 comments]
Categories: Incoming News

Learn You Some Algebras for Glorious Good! - A fun, easy-to-read math textbook.

Haskell on Reddit - Tue, 12/30/2014 - 12:59pm

Most of you probably know Miran Lipovača's book, Learn You a Haskell for Great Good!. Well, I've started writing a math textbook, which is sort of like LYAH, but not quite. It's called Learn You Some Algebras for Glorious Good!.

The main critique of LYAH is that it's incomplete, so I'm going to do my best to rigorously cover all of my bases. The book is going to have categories, functors, monads, all that good stuff, right from the very start. I always found it quite an annoyance to have abstract concepts presented as advanced material, when they actually make the "less advanced" stuff much easier to understand.

The book is open-source, so anyone is welcome to contribute/fork/what-have-you. The book is licensed under the GNU Free Documentation License.

I haven't set up a website for the book yet.

If you have any questions, feel free to leave a comment or PM me. Thanks!

Edit: A lot of people are having trouble with the PDF's on Gitorious. Gitorious is also presenting a number of other problems, so I moved the project to GitLab.

submitted by pharpend
[link] [100 comments]
Categories: Incoming News

What really are free monads?

Haskell on Reddit - Tue, 12/30/2014 - 10:37am

I've been reading just a lot of stuff about free monads, but I still can't see what they are, or how to apply them in my programs. I've seen a bunch of examples, mostly where a free monad is used to build up a data structure which is then somehow evaluated.

My feel from reading answers like this one is that a Free monad is sort of like a monoid, which builds upon itself.

The problem I have with them is that I don't see many applications which aren't built just for the purpose of an example, and when I do see one, I don't really understand why it is a good idea to use a free monad in that case.
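For context, here is the kind of minimal example that usually comes up (all names here are illustrative): a tiny console DSL built as a free monad, where the same program can be run in IO or interpreted purely against canned input.

```haskell
-- A minimal free monad and a console DSL built on it; illustrative only.
data Free f a = Pure a | Roll (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Roll x) = Roll (fmap (fmap g) x)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> x = fmap g x
  Roll f <*> x = Roll (fmap (<*> x) f)

instance Functor f => Monad (Free f) where
  return = pure
  Pure a >>= k = k a
  Roll x >>= k = Roll (fmap (>>= k) x)

-- One functor constructor per DSL operation.
data ConsoleF next
  = PutLine String next
  | GetLine (String -> next)

instance Functor ConsoleF where
  fmap g (PutLine s next) = PutLine s (g next)
  fmap g (GetLine k)      = GetLine (g . k)

putLine :: String -> Free ConsoleF ()
putLine s = Roll (PutLine s (Pure ()))

getLine' :: Free ConsoleF String
getLine' = Roll (GetLine Pure)

-- A program in the DSL: just a data structure until interpreted.
greet :: Free ConsoleF ()
greet = do
  putLine "name?"
  n <- getLine'
  putLine ("hello, " ++ n)

-- Interpreter 1: actually perform the effects.
runIO :: Free ConsoleF a -> IO a
runIO (Pure a)             = return a
runIO (Roll (PutLine s k)) = putStrLn s >> runIO k
runIO (Roll (GetLine k))   = getLine >>= runIO . k

-- Interpreter 2: run purely against canned input, collecting the output.
runPure :: [String] -> Free ConsoleF a -> ([String], a)
runPure _       (Pure a)             = ([], a)
runPure ins     (Roll (PutLine s k)) = let (out, a) = runPure ins k in (s : out, a)
runPure (i:ins) (Roll (GetLine k))   = runPure ins (k i)
runPure []      (Roll (GetLine k))   = runPure [] (k "")
```

The selling point is exactly this separation: greet is built once as data, and each interpreter gives it a different meaning.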

submitted by progfu
[link] [34 comments]
Categories: Incoming News

New gtk2hs 0.12.4 release

gtk2hs - Wed, 11/21/2012 - 12:56pm

Thanks to John Lato and Duncan Coutts for the latest bugfix release! The latest packages should be buildable on GHC 7.6, and the cairo package should behave a bit nicer in ghci on Windows. Thanks to all!

~d

Categories: Incoming News