News aggregator

Ask For Review: My First Haskell Project (Machine Learning and SGD Algorithm)

Haskell on Reddit - Sat, 12/06/2014 - 2:08am

I recently met Haskell and has since been fascinated by it. I can't explain the reason. Maybe simply because Haskell is so different from what I have been using: Java for infrastructure, Python for data crunching and machine learning, R for data analysis and plotting. I wanted to learn Haskell so I implemented one of my favorite machine learning algorithm: stochastic gradient descent (SGD). Here is the link to it. The core of the algorithm is implemented in Train.hs, the basic idea of which is iterating the input data points and update a weight vector accordingly.

I used the most obvious language syntax and features. There is inevitable many parts not haskell-ish (I didn't even tune foldr vs foldl), thus any suggestion is welcome. The aspects that I am aware of and would like suggestions the most are:

  • Heap consumption and lazy evaluation. One of the advantages of SGD is its low computational and space complexity. The data points (e.g. 10 Million ) do not need to fit in memory because they are checked one by one. Therefore, the memory consumption is supposed to be constant. However, the heap consumption of my Haskell implementation seems like beyond linear (e.g. for 100 MB of data, it consumes over 5 GB of memory until I manually kill it) . I appreciate the elegant of lazy loading the data. But I guess the extra heap consumption is due to the other lazy evaluations? It seems that I can force Strict evaluation in certain part. Given the nature of the algorithm, which part needs to force the Strictness. Is there the risk that strictness syntax pollutes all over the code? How do you solve such problems in general?

  • Language features for iterative algorithm. My algorithm iterate the data set multiple times, and a weight vector is updated each time a data point is seen. I basically implemented it with two levels of foldr(which would be two levels of for-loop in Java). This certainly involves encoding all the to-be-updated variables, and passing functions (or partially applied functions) as well. Is there any existing FP abstraction for iterative algorithm (or say, many levels of foldr, not necessarily recursion though)?

  • What else can be improved or replaced?

update 1 (2014-12-07)

I included some information on how one would run the code and where to find sample data.

I also did a basic memory tuning according to tom-md and bgamari, i.e. remove the call length x and those similar. I have posted the information on the same github page. Unfortunately, I do not see any improvement, and the memory consumption is linear rather than the expected constant. Any further advice would be very appreciated.

submitted by causality0
[link] [3 comments]
Categories: Incoming News

I want to get some haskell programming done in android. What are some means of doing this?

Haskell on Reddit - Fri, 12/05/2014 - 7:19pm

I definitely don't want to host a virtual machine. So far I am considering having a virtual private server which I SSH into, but that costs money. What I would really like is to use some kind of container like docker.

submitted by read_harder
[link] [5 comments]
Categories: Incoming News

Roman Cheplyaka: Extensible effects: abstracting from the transformer

Planet Haskell - Fri, 12/05/2014 - 4:00pm

In Type-based lift, we saw a way to lift monadic actions automatically to the right layer of a multilayer monad transformer stack, based only on the types involved.

Namely, we defined a closed type family

type family Find (t :: (* -> *) -> (* -> *)) (m :: * -> *) :: Nat where Find t (t m) = Zero Find t (p m) = Suc (Find t m)

that computes the type-level index of the layer t in the stack m. Such an index can then be used to construct an appropriate lifting function of type t n a -> m a.

This works well as a shortcut, so instead of writing lift, or lift . lift, or lift . lift . lift, we can write magicLift, and let it figure out how far to lift.

However, the lifting is expressed in terms of specific transformers, and not the effects they can handle. For example, a stateful computation may target the strict State monad or the lazy one, but not both, because they are implemented by distinct types.

Let’s fix this!

CanDo

To know which effects can be handled by each transformer, we’ll introduce a new type family, CanDo:

type family CanDo (m :: (* -> *)) (eff :: k) :: Bool

Now we need to modify the Find family to find the first (top-most) layer for which the CanDo predicate will return True. Since on the type level we don’t have lambdas and case statements, doing so is a bit cumbersome but still possible:

type family MapCanDo (eff :: k) (stack :: * -> *) :: [Bool] where MapCanDo eff (t m) = (CanDo (t m) eff) ': MapCanDo eff m MapCanDo eff m = '[ CanDo m eff ] type family FindTrue (bs :: [Bool]) :: Nat where FindTrue (True ': t) = Zero FindTrue (False ': t) = Suc (FindTrue t) type Find eff (m :: * -> *) = FindTrue (MapCanDo eff m)

Next, we need to introduce dummy types denoting effects, and relate them to transformers:

import qualified Control.Monad.Trans.State.Lazy as SL import qualified Control.Monad.Trans.State.Strict as SS data EffState (s :: *) data EffWriter (w :: *) data EffReader (e :: *) type instance CanDo (SS.StateT s m) eff = StateCanDo s eff type instance CanDo (SL.StateT s m) eff = StateCanDo s eff type family StateCanDo s eff where StateCanDo s (EffState s) = True StateCanDo s (EffReader s) = True StateCanDo s (EffWriter s) = True StateCanDo s eff = False

As we see, the relationship between effects and transformers is many-to-many. A single effect can be implemented by multiple transformers, and a single transformer can implement multiple effects.

Should StateT implement EffReader?

It’s not only for demonstration that I made StateT implement the EffReader effect. One can view EffReader (and EffWriter) computations as a subclass of all stateful computations

Suppose that you have two computations with the same state that you want to execute sequentially, but one of them only needs read access to the state. Why not express that fact in its type?

Other cool tricks

Here are a couple of cool (and useful!) tricks that are possible with extensible effects. I will only describe what they do and not how they’re implemented; you can find all code in the repo.

Zooming newtype ZoomT big small m a = ...

ZoomT handles EffState small effects by transforming them to EffState big effects. To enable this, we must supply a lens from big into small:

runZoom :: Lens big small -> ZoomT big small m a -> m a

Compared to traditional zooming (as in lens), where we can only focus on a single state at a time, here we can apply many ZoomTs stacked on top of each other, thus handling multiple different EffState effects simultaneously.

Custom Writer handlers

The classic use case for a Writer monad is logging. Ironically, the way a writer monad does logging (accumulating log messages in memory) is wrong for almost all purposes, and possible right ways (flushing messages to disk or sending them over the network) are out of reach for it.

newtype CustomWriterT w m a = ... evalWriterWith :: (w -> m ()) -> CustomWriterT w m a -> m a

The idea here is that CustomWriterT handles EffWriter effect by calling the given handler for it — exactly what we wanted!

Categories: Offsite Blogs

Weird abuse of lexer and the type system (beginner)

Haskell on Reddit - Fri, 12/05/2014 - 1:12pm

I'm still just a novice, so what's weird to me may be trivial to another, but I was completely dumbfounded when this code ran as expected:

(+--) :: Num a => a -> a -> a -> a (x +-- y) r = x - r * (x - y) sample :: (Num a, Show a) => (a, a) -> a -> String sample seg r = concat [show r, ":\t", show $ uncurry (+--) seg r] main :: IO () main = mapM_ (putStrLn . sample (2, 8)) [-1.0, -0.75.. 2.0] -- output: -1.0: -4.0 -0.75: -2.5 -0.5: -1.0 -0.25: 0.5 0.0: 2.0 0.25: 3.5 0.5: 5.0 0.75: 6.5 1.0: 8.0 1.25: 9.5 1.5: 11.0 1.75: 12.5 2.0: 14.0

What's so weird? Well, -- appears outside of a string literal but does not induce a comment string. And I guess it's not surprising that (+--) can take three parameters, since it's a really a function a -> b -> c, where c happens to be a -> a. But it still feels really weird to me.

submitted by lynko
[link] [3 comments]
Categories: Incoming News

Functional Jobs: Senior Software Engineer at McGraw-Hill Education (Full-time)

Planet Haskell - Fri, 12/05/2014 - 9:43am

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District.

We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

  • Move textbooks and learning into the digital era
  • Create software used by millions of students
  • Advance the state of the art in adaptive learning technology
  • Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include especially JavaScript, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you have do some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

  • Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
  • Be comfortable moving between multiple programming languages
  • Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

Categories: Offsite Blogs

Philip Wadler: Functional Programming is the New Black

Planet Haskell - Fri, 12/05/2014 - 6:24am
Reading the conclusion of my friend Maurice Naftalin's new book on Java 8 brought home to me the extent to which Functional Programming is, indeed, the new black. You can read it here.

Maurice Naftalin, Mastering Lambdas (Conclusion), McGraw-Hill, 2014.
Other links: Java 8 Lambda FAQ, Mastering Lambdas.
Categories: Offsite Blogs

Well-Typed.Com: Compose conference and New York City Haskell courses

Planet Haskell - Fri, 12/05/2014 - 5:56am

Well-Typed is happy to announce that we are sponsoring

C◦mp◦se conference

Friday, January 30 – Sunday, February 1, 2015, New York City

This conference is focused on functional programming and features a keynote by Stephanie Weirich on dependent types as well as invited talks by Anthony Cowley, Maxime Ransan and Don Syme, plus a whole lot of additional contributed talks. There’s also an “unconference” with small workshops and tutorials as well as the opportunity to get your hands dirty and try things out yourself.

For several years now, we have been running successful Haskell courses in collaboration in Skills Matter. For the C◦mp◦se conference, we have decided to bring theses courses to New York! You can participate in our Haskell courses directly before or directly after the conference (or both):

Fast Track to Haskell

Thursday, January 29 – Friday, January 30, 2015, New York City

(Don’t worry, there’s no overlap with Stephanie Weirich’s keynote on Friday evening that marks the start of C◦mp◦se.)

You can register here.

This course is for developers who want to learn about functional programming in general or Haskell in particular. It introduces important concepts such as algebraic datatypes, pattern matching, type inference, polymorphism, higher-order functions, explicit effects and, of course, monads and provides a compact tour with lots of hands-on exercises that provide a solid foundation for further adventures into Haskell or functional programming.

Advanced Haskell

Monday, February 2 – Tuesday, February 3, 2015, New York City

You can register here.

This course is for developers who have some experience in Haskell and want to know how to work on larger projects and how things scale. The course covers important topics such as selecting the right data structures for a task at hand, taking the functional perspective into account, and takes a thorough look at Haskell’s “lazy evaluation” and how to reason about time and space performance of Haskell programs. There’s also some focus on how to use Haskell’s powerful abstraction mechanisms and concepts such as Applicative Functors, Monads and Monad Transformers to help you organize larger code bases. Finally, depending on time and demand, there’s the opportunity to look at Parallelism and Concurrency, or at type-level programming. Once again, the course comes with several carefully designed hands-on exercises and provides room for specific questions that the participants might have.

Both courses will be taught by Duncan Coutts, co-founder and partner at Well-Typed. He’s both an experienced teacher and is involved in lots of commercial Haskell development projects at Well-Typed. He’s seen a lot of Haskell code, and knows perfectly which techniques and approaches work and which do not.

Well-Typed training courses

In general, our courses are very practical, but don’t shy away from theory where necessary. Our teachers are all active Haskell developers with not just training experience, but active development experience as well. In addition to these two courses in New York City, we regularly offer courses in London, and plan to offer courses in other European locations, too.

We also provide on-site training on requests nearly anywhere in the world. If you want to know more about our training or have any feedback or questions, have a look at our dedicated training page or just drop us a mail.

Categories: Offsite Blogs

Autocompletion in Emacs

Haskell on Reddit - Fri, 12/05/2014 - 5:19am

Hello Haskellers,

I have finally got both haskell mode and ghc-mod to work correctly in emacs. This has been a big boon to my productivity and has generally much increased emacs' awesomeness.

The one thing that is missing is good auto completion. What I would like to have is reliable auto completion of modules that I import. (This is generally much more useful to me than autocompletion of my own modules b/c I generally remember pretty well what I have just written but not all the functions that are exported by module ABC).

Example: I have

import qualified Data.ByteString as B

now when I type

B.

and run the auto-completion command I would like to see all functions exported from that module (bonus points for type signatures and docstrings).

Is something like this currently possible?

submitted by paulkoer
[link] [13 comments]
Categories: Incoming News

LHC Team: Compiling to JavaScript.

Planet Haskell - Thu, 12/04/2014 - 8:21pm
Lots of very interesting things are possible when everything (including the runtime system) is translated to LLVM IR. For example, compiling to JavaScript becomes trivial. Consider this ugly version of Hello World:


{-# LANGUAGE MagicHash #-}
module Main (main) where

import LHC.Prim

putStrLn :: List Char -> IO Unit
putStrLn msg = putStr msg `thenIO` putStr (unpackString# "\n"#)

main :: IO Unit
main = putStrLn (unpackString# "Hello World!"#)

entrypoint :: Unit
entrypoint = unsafePerformIO main

Notice the 'List' and 'Unit' types, and the 'thenIO' and  'unpackString#' functions. There's no syntactic sugar in LHC yet. You can get everything sugar-free these days, even Haskell compilers.

Running the code through the LLVM dynamic compiler gives us the expected output:

# lli Hello.ll
Hello World!

Neato, we have a complete Haskell application as a single LLVM file. Now we can compile it to JavaScript without having to worry about the garbage collector or the RTS; Everything has been packed away in this self-contained file.

$ emcc -O2 Hello.ll -o Hello.js # Compile to JavaScript using
# emscripten.
$ node Hello.js # Run our code with NodeJS.
Hello World!

$ ls -lh Hello.js # JavaScript isn't known to be
# terse but we're still smaller
# than HelloWorld compiled with GHC.
-rw-r--r-- 1 lemmih staff 177K Dec 4 23:33 Hello.js
Categories: Offsite Blogs

Oliver Charles: 24 Days of GHC Extensions: Bang Patterns

Planet Haskell - Thu, 12/04/2014 - 6:00pm

Over the last few days, we’ve been looking at various GHC extensions that centre around forming bindings. Today I’d like to look at one more extension in this area - bang patterns. Much like with record wildcards yesterday, the extension is small, yet extremely useful.

> {-# LANGUAGE BangPatterns #-} > import Data.Function (fix) > import Data.List (foldl')

Generally speaking, bang patterns allow us to annotate pattern matches to indicate that they should be strict. To understand this, we should start by understanding the interaction between pattern matching and Haskell’s evaluation strategy. When we are writing functions, any inputs to the function will not be evaluated until we pattern match on them. For example, the following contrived function doesn’t pattern match on its argument, so it doesn’t force any evaluation on it:

> hello :: Bool -> String > hello loud = "Hello."

If we apply hello to various arguments, the behaviour is the same - even for undefined values:

-> hello True "Hello." -> hello False "Hello." -> hello undefined "Hello." -> hello (fix id) "Hello."

However, by pattern matching on the Bool, we force evaluation of loud:

> hello2 :: Bool -> String > hello2 True = "Hello!" > hello2 False = "hello" -> hello2 True "Hello!" -> hello2 False "hello" -> hello2 undefined *** Exception: Prelude.undefined -> hello2 (fix id) "*** Exception: <<loop>>

Specifically, the pattern match will evaluate the input argument enough to perform the pattern match - to determine which pattern is appropriate. Usually this would be evaluation to weak head normal form, but that’s not strictly true with nested pattern matches. For more of a discussion on this, interested readers are pointed to Simon Marlow’s book Parallel and Concurrent Programming in Haskell, which has a fantastic discussion on this.

But what does this all have to do with bang patterns? Bang patterns is an extension that will evaluate specific arguments to weak head normal form regardless of the pattern match performed. If we revisit our example hello function, rewriting it with bang patterns, we have

> hello3 :: Bool -> String > hello3 !loud = "Hello."

This function will now produce values only if loud evaluates to True or False:

-> hello3 True "Hello." -> hello3 False "Hello." -> hello3 undefined *** Exception: Prelude.undefined -> hello3 (fix id) "*** Exception: <<loop>>

So much for theory, but why would you want to do such a thing? Bang patterns are a fantastic extension when you don’t need Haskell’s implicit laziness. A common case is when performing computations over large lists of data. If we’re just summarising a list or collection, forcing the value at every step leads to considerably better memory usage, and that in turn leads to better performance. Johan Tibell - an expert in the realm of high performance haskell - has a lovely example of where bang patterns are useful, in this snippet for calculating the mean of a list of Doubles:

> mean :: [Double] -> Double > mean xs = s / fromIntegral l > where > (s, l) = foldl' step (0, 0) xs > step (!s, !l) a = (s + a, l + 1)

Here we’re finding the mean of a list of numbers. If we kept this entirely lazy, we’ll build up a huge computation - a + b + c + d + e + ... and 0 + 1 + 1 + 1 + 1 + ..., for the entire length of the list! This is a horrible usage of memory, and we don’t need this laziness. It looks like using foldl' should be sufficient, but note that foldl' only evaluates to weak head normal form. In this case, that’s the pair of Doubles but not the Doubles themselves! Therefore we use bang patterns on s and l, forcing every step of the computation to evaluate the underlying Double.

It may be illuminating to consider the desugared version of the program:

mean :: [Double] -> Double mean xs = s / fromIntegral l where (s, l) = foldl' step (0, 0) xs step (s, l) a = let s' = s + a l' = l + 1 in s' `seq` l' `seq` (s', l')

This program is equivalent in strictness, but as you can see - syntactically we had to do a lot more work to get there.

In conclusion, bang patterns are a lovely extension for working with high performance code. I particularly like that we can indicate strictness syntactically, which I find makes scanning through code to understand its evaluation strategy clearer than looking for seqs. Also, BangPatterns are so lightweight, when we are trying to optimise our program - often an inherently experimental process - it’s easy to swap out different variations on strictness.

This post is part of 24 Days of GHC Extensions - for more posts like this, check out the calendar.

Categories: Offsite Blogs

Edward Z. Yang: Ubuntu Utopic upgrade (Xmonad)

Planet Haskell - Thu, 12/04/2014 - 5:48pm

I finally got around to upgrading to Utopic. A year ago I reported that gnome-settings-daemon no longer provided keygrabbing support. This was eventually reverted for Trusty, which kept everyone's media keys.

I'm sorry to report that in Ubuntu Utopic, the legacy keygrabber is no more:

------------------------------------------------------------ revno: 4015 [merge] author: William Hua <william.hua@canonical.com> committer: Tarmac branch nick: trunk timestamp: Tue 2014-02-18 18:22:53 +0000 message: Revert the legacy key grabber. Fixes: https://bugs.launchpad.net/bugs/1226962.

It appears that the Unity team has forked gnome-settings-daemon into unity-settings-daemon (actually this fork happened in Trusty), and as of Utopic gnome-settings-daemon and gnome-control-center have been gutted in favor of unity-settings-daemon and unity-control-center. Which puts us back in the same situation as a year ago.

I don't currently have a solution for this (pretty big) problem. However, I have solutions for some minor issues which did pop up on the upgrade:

  • If your mouse cursor is invisible, try running gsettings set org.gnome.settings-daemon.plugins.cursor active false
  • If you don't like that the GTK file dialog doesn't sort folders first anymore, try running gsettings set org.gtk.Settings.FileChooser sort-directories-first true. (Hat tip)
  • And to reiterate, replace calls to gnome-settings-daemon with unity-settings-daemon, and use unity-control-panel to do general configuration.
Categories: Offsite Blogs

Robin KAY: HsQML 0.3.2.1 released: London Edition

Planet Haskell - Thu, 12/04/2014 - 5:18pm
Last week I gave a talk to the London Haskell User Group on my GUI library HsQML. The slides are available now and a video of the talk will be posted on the group's YouTube channel in due course (I'll post again when that happens).

The most distinctive, some might say contentious, thing about HsQML compared to other Haskell GUI libraries is the split between implementing the back-end logic of an application in Haskell and describing its user interface using the QML domain specific language. This tends to frighten people off and I was at pains to stress that while QML does have inbuilt scripting capabilities, we can build real applications with just a thin layer of QML over our Haskell code.

The talk walking through the implementation of a new sample "sticky notes" application. Here, the Haskell back-end takes care of persisting the user's data in an SQLite database and exposes a data model to QML. Several alternate QML front-ends then show how the same data model can be skinned with different user interfaces.

One of the sticky notes application's front-ends uses Qt Quick Controls for a native look and feel, shown here on three platforms.
Belatedly also, I'm announcing version 0.3.2.1 of HsQML which debuted on Hackage at the week-end. This minor release fixes a couple of bugs. Most notably, that fact that HsQML was leaking the QApplication object and this caused programs to occasionally crash on exit under Linux. HsQML now ships an OnExitHook() which should shutdown the Qt framework when the GHC RTS does, or alternatively you can call the new shutdownQt function to do it manually.

release-0.3.2.1 - 2014.11.29

* Added function to shutdown the Qt framework.
* Fixed intermittent crash on exit under Linux.
* Fixed reanimated objects being passed to QML as undefined.
* Fixed typo in the names of implicit property signals.
Categories: Offsite Blogs

Alessandro Vermeulen: Orchestration support announced on DockerCon

Planet Haskell - Thu, 12/04/2014 - 3:46pm
Orchestration

The philosophy behind docker is that in order to be solved, a large problem has to be divided into its root problems. One can then proceed by solving every one of these problems step by step. Additionally all elements of the solution need to communicate through a common app.

Docker has always been a tool with a single purpose: the creating, transport, and running of images. Until today there where several issues with docker that make using it somewhat trying at times. It lacked in capabilities for orchestration which is categorized by:

  1. Installation of a docker host from scratch;
  2. Clustering of multiple docker hosts to spread resource utilization over the cluster;
  3. Managing inter-container dependencies at runtime.

Today this changed as Docker Inc. announced a new set of tools.

Provisioning: Machine

Machine provides a one step installer for creating a new docker host on your local machine, a publicly hosted cloud, or a private cloud. It will automatically provision a new machine and set the environment variables such that any following docker command runs on the newly created host. This is very similar to what boot2docker provides.

There are several engines for provisioning in different platforms such as:

  • VirtualBox
  • VMWare
  • AWS
  • Microsoft Hypervisor
  • etc. (todo link)

More information can be found at github.

Clustering: Swarm

Ideally you want to control a cluster of docker hosts with the same interface as you control a single host. In other words the interface needs to be transparent or standardized. With swarm you can.

All existing commands on docker work with the swarm as well. Just point your docker binary to the swarm proxy and you are controlling the cluster instead of one single machine. Swarm is location/data center aware and also incorporates resource management. The default strategy is to use as little hosts as possible. The strategy places several lighter containers on the same node in order to reserve other nodes for heavier containers.

The main features are:

  1. Resource management
  2. Scheduling honoring constraints
  3. Health checks on the cluster and nodes
  4. Supporting the entire docker interface

Additionally Mesos can be used to provide the scheduling. Docker Inc. also announced that Mesos will be a first class citizen in Docker. The goal is to be able to run docker containers along side other Mesos jobs in the Mesos cluster.

More information also on github.

It appears that swarm is not supported yet by machine, sadly.

Managing inter-container dependencies: Composer

Setting up applications that require multiple containers to function correctly is difficult. Keeping them running is even harder. Docker proposes the Docker Composer.

Traditionally it ran on one single machine and, until today, orchestration needed to be done manually or through external tools.

Docker Hub

Docker Inc. also announces an enterprise version of the Docker Hub. It is able to run wherever the enterprise needs it to run and comes with safe 1-click upgrades. Enterprises are adopting containers as development is up to 30 times faster with halve the error rate.

Some fun facts:

  • 100000 contributors to docker hub
  • 157 TB of data transmitted each month
  • 50 TB of data stored

The timeline for 2015:

  1. Increase performance of pulls
  2. Increase transparancy by adding and improving on status pages
  3. Engage in partnership with Microsoft. Most notably this will result being able to run Linux on Microsoft Azure.
Categories: Offsite Blogs