I'm implementing the low-level interface to a large C library using c2hs. First I'll briefly describe the low-level setting, and then I'll ask a few questions concerning the design of the higher-level interface (I've asked the same question on haskell-cafe, so feel free to reply here or there).
The C functions can be broadly classified with a few criteria:
the main data structure they operate on, e.g. returnType AFunction (A* a, ...);
creation, modification, read access, and destruction operations for each of these data structures, following a common pattern, e.g.: ACreate(), ASet1(), ASet2(), ..., ARead(), ADestroy(), BCreate(), etc. The library requires explicit destruction of all Create()d objects.
The library rests on top of MPI, so there are e.g. AScatterBegin(), AScatterEnd() function pairs that account for potentially large delays associated with computing and moving data around.
On the Haskell side, the C pointer types are represented (using GeneralizedNewtypeDeriving and StandaloneDeriving) as

newtype Ah = Ah (Ptr Ah)
deriving instance Storable Ah
and e.g. a Create() function returns the representation of the fresh pointer along with the usual integer error code:

ahCreate :: MPIComm -> IO (Ah, Err)

Questions:
What abstraction may I use to capture logic such as:
separate read-only functions from the potentially overwriting ones
guarantee precedence relations (e.g. computation can only occur after initialization)
collect or even compose error codes, letting computation go through only if Err == 0 (e.g. an n-ary Maybe?). Sequencing in the IO monad causes all effectful functions to run and return their result and/or error code.
reflect the deallocation of a C-land object on the Hs side.
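For the deallocation and error-code questions, here is one minimal sketch of a common pattern. The names ahCreate/ahDestroy and the Err type are hypothetical stand-ins for the c2hs-generated bindings described above, stubbed out so the sketch compiles on its own: bracket guarantees destruction, and an ExceptT wrapper makes a non-zero error code short-circuit the rest of the computation.

```haskell
import Control.Exception (bracket)
import Control.Monad.Trans.Except (ExceptT (..), runExceptT)

type Err = Int

-- Stand-in for the c2hs-generated binding; the real Ah wraps a Ptr
-- and the real calls go through the FFI.
newtype Ah = Ah Int

ahCreate :: IO (Ah, Err)
ahCreate = return (Ah 1, 0)

ahDestroy :: Ah -> IO Err
ahDestroy _ = return 0

-- Lift a (value, code) call into ExceptT so that a non-zero code
-- short-circuits everything sequenced after it:
check :: IO (a, Err) -> ExceptT Err IO a
check act = ExceptT $ do
    (x, e) <- act
    return $ if e == 0 then Right x else Left e

-- Guarantee destruction even if the body throws:
withAh :: (Ah -> IO b) -> IO b
withAh = bracket (fmap fst ahCreate) ahDestroy
```

A session then reads withAh $ \a -> runExceptT $ do ..., and the read-only/mutating split can be pushed to the type level, e.g. by newtype-tagging the handle or exposing two separate wrapper modules.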
All feedback is welcome; thank you in advance.

submitted by ocramz
I mean, Haskell is not very popular yet, even though it is an awesome and fast language. Do you think Haskell can be a big deal in the future or not? Does OOP offer more possibilities? Or will they both coexist just as they do now? I just wonder if there clearly is a "winner" between OOP and FP.

submitted by sammecs
tl;dr Please check out beta.stackage.org
I made the first commit to the Stackage Server code base a little over a year ago. The goal was to provide a place to host package sets which both limited the number of packages from Hackage available, and modified packages where necessary. This server was to be populated by regular Stackage builds, targeted at multiple GHC versions, and consisted of both inclusive and exclusive sets. It also allowed interested individuals to create their own package sets.
If any of those details seem surprising today, they should. A lot has happened for the Stackage project in the past year, making details of what was initially planned irrelevant, and making other things (like hosting of package documentation) vital. We now have LTS Haskell. Instead of running with multiple GHC versions, we have Stackage Nightly which is targeted at a single GHC major version. To accommodate goals for GPS Haskell (which unfortunately never materialized), Stackage no longer makes corrections to upstream packages.
I could go into lots more detail on what is different in project requirements. Instead, I'll just summarize: I've been working on a simplified version of the Stackage Server codebase to address our goals better, more easily ensure high availability, and make the codebase easier to maintain. We also used this opportunity to test out a new hosting system our DevOps team put together. The result is running on beta.stackage.org, and will replace the official stackage.org after a bit more testing (which I hope readers will help with).

The code
All of this code lives on the simpler branch of the stackage-server code base, and much to my joy, resulted in quite a bit less code. In fact, there's just about a 2000 line reduction. The rest of this post will get into how that happened.

No more custom package sets
One of the features I mentioned above was custom package sets. This fell out automatically from the initial way Stackage Server was written, so it was natural to let others create package sets of their own. However, since release, only one person actually used that feature. I discussed with him, and he agreed with the decision to deprecate and then remove that functionality.
So why get rid of it now? Two powerful reasons:
- We already host a public mirror of all packages on S3. Since we no longer patch upstream packages, it's best if tooling is able to just refer to that high-reliability service.
- We now have Git repositories for all of LTS Haskell and Stackage Nightly. Making these the sources of package sets means we don't have two (possibly conflicting) sources of data. That brings me to the second point:
We had some complicated logic to allow users to upload package sets. It started off simple, but over time we added Haddock hosting and other metadata features, making the code more complex. Actually, it ended up having two parallel code paths for this. So instead, we now just upload information on the package sets to the Git repositories, and leave it up to a separate process (described below) to clone these repositories and make the data available to the server.

Haddocks on S3
After generating a snapshot, the Haddocks used to be tarred and compressed, and then uploaded as a compressed bundle to S3. Then, Stackage Server would receive a request for files, unpack them, and serve them. This presented some problems:
- Users would have to wait for a first request to succeed during the unpacking
- With enough snapshots being generated, we would eventually run out of disk space and need to clear our temp directory
- Since we run our cluster in a high availability mode with multiple horizontally-scaled machines, one machine may have finished unpacking when another didn't, resulting in unstyled content (see issue #82).
Instead, we now just upload the files to S3 and redirect there from stackage-server (though we'll likely switch to reverse proxying to allow for nicer SSL URLs). In fact, you can easily view these docs, at URLs such as http://haddock.stackage.org/lts-2.9/ or https://s3.amazonaws.com/haddock.stackage.org/nightly-2015-05-21/index.html.
These Haddocks are publicly available, and linkable from projects beyond Stackage Server. Each set of Haddocks is guaranteed to have consistent internal links to other compatible packages. And while some documentation doesn't generate due to known package bugs, the generation is otherwise reliable.
I've already offered access to these docs to Duncan for usage on Hackage, and hope that will improve the experience for users there.

Metadata SQLite database
Previously, information on snapshots was stored in a PostgreSQL database that was maintained by Stackage Server. This database also had package metadata, like author, homepage, and description. Now, we have a completely different process:
- The all-cabal-metadata from the Commercial Haskell Special Interest Group provides an easily cloneable Git repo with package metadata, which is automatically updated by Travis.
- We run a cron job on the stackage-build server that updates the lts-haskell, stackage-nightly, and all-cabal-metadata repos and generates a SQLite database from them with all of the data that Stackage Server needs. You can look at the Stackage.Database module for some ideas of what this consists of. That database gets uploaded to Amazon S3, and is actually publicly available if you want to poke at it.
- The live server downloads a new version of this file on a regular basis
I've considered spinning off the Stackage.Download code into its own repository so that others can take advantage of this functionality in different contexts if desired. Let me know if you're interested.
At this point, the PostgreSQL database is just used for non-critical functionality, such as social features (tags and likes).

Slightly nicer URLs
When referring to a snapshot, there are "official" short names (slugs), of the form lts-2.9 and nightly-2015-05-22. The URLs on the new server now reflect this perfectly, e.g.: https://beta.stackage.org/nightly-2015-05-22. We originally used hashes of the snapshot content for the original URLs, but that was fixed a while ago. Now that we only have to support these official snapshots, we can always (and exclusively) use these short names.
As a convenience, if you visit the following URLs, you get automatic redirects:
- /nightly redirects to the most recent nightly
- /lts to the latest LTS
- /lts-X to the latest LTS in the X.* major version (e.g., today, /lts-2 redirects to /lts-2.9)
This also works for URLs under that hierarchy. For example, consider https://beta.stackage.org/lts/cabal.config, which is an easy way to get set up with LTS in your project (by running wget https://beta.stackage.org/lts/cabal.config).

ECS-based hosting
While not a new feature of the server itself, the hosting cluster we're running this on is brand new. Amazon recently released EC2 Container Service, which is a service for running Docker containers. Since we're going to be using this for the new School of Haskell, it's nice to be giving it a serious usage now. We also make extensive use of Docker for customer projects, both for builds and hosting, so it's a natural extension for us.
This ECS cluster uses standard Amazon services like Elastic Load Balancer (ELB) and auto-scaling to provide for high availability in the case of machine failure. And while we have a lot of confidence in our ability to keep Stackage Server up and running regularly, it's nice that our most important user-facing content is provided by these external services:
- Haddocks on S3
- Package mirroring on S3
- LTS Haskell and Stackage Nightly build plans on GitHub
- Package metadata on GitHub
- Package index metadata on GitHub (via stackage-update and all-cabal-files/hashes)
This provides for a pleasant experience in both browsing the website and using Stackage in your build system.
A special thanks to Jason Boyer for providing this new hosting cluster, which the whole FP Complete team is looking forward to putting through its paces.
Hi guys, I'm learning Haskell by doing Project Euler problems. I've gotten the hang of using the State monad along with Map when I would normally use a hash table. But now I need two separate hash tables, and as far as I can tell State only allows one persistent data structure. For now, I can make a Map that stores an extra Bool to differentiate, but what is the real way to do this? Thanks in advance for any advice.

submitted by peterlew
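One idiomatic answer to the question above: make the state a record containing both Maps, so each table keeps its own key and value types. The field names below are hypothetical, standing in for whatever the two tables hold.

```haskell
import qualified Data.Map as Map
import Control.Monad.Trans.State (State, execState, modify)

-- Hypothetical example: one table memoising results, one marking seen values.
data Tables = Tables
    { memo :: Map.Map Int Integer
    , seen :: Map.Map Int Bool
    }

emptyTables :: Tables
emptyTables = Tables Map.empty Map.empty

-- Each update touches only its own field via record update:
remember :: Int -> Integer -> State Tables ()
remember k v = modify $ \t -> t { memo = Map.insert k v (memo t) }

markSeen :: Int -> State Tables ()
markSeen k = modify $ \t -> t { seen = Map.insert k True (seen t) }
```

The same shape scales to any number of tables; lenses, or StateT over a tuple, are common variations on it.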
I've been scratching my head for a while coming up with a proper instance of MonadFix for ParsecT (or even Parsec, assuming -XTypeSynonymInstances.) That is:

instance Monad m => MonadFix (ParsecT s u m) where
    mfix f = ???
Any help would be appreciated. Thank you!

submitted by analphabetic
Now that accelerate can run computations on the cpu (link) and performs better, is there still a need for repa?

submitted by precalc
Why on earth is cabal so miserable to work with? Why can't I uninstall packages without having to nuke my installation? No, I don't care if you say it's not a package manager; it's installing shit on my system, therefore it's a package manager. I love coding Haskell, but cabal is so shit and worthless. I'd rather code in Java than dick around in Haskell, and I hate Java. If anything is going to kill Haskell, it's this trash.

submitted by dpakattack
Summary: The development version of ghcid seemed to have some problems with terminating when Control-C was hit, so I investigated and learnt some things.
Given a long-running/interactive console program (e.g. ghcid), when the user hits Control-C/Ctrl-C the program should abort. In this post I'll describe how that works in Haskell, how it can fail, and what asynchronous exceptions have to do with it.

What happens when the user hits Ctrl-C?
When the user hits Ctrl-C, GHC raises an async exception of type UserInterrupt on the main thread. This happens because GHC installs an interrupt handler which raises that exception, sending it to the main thread with throwTo. If you install your own interrupt handler you won't see this behaviour and will have to handle Ctrl-C yourself.
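To make the last point concrete, here is a POSIX-only sketch of installing your own interrupt handler that restores the default behaviour by hand. Doing this replaces GHC's handler, so Ctrl-C no longer raises UserInterrupt unless your handler does it itself, as here. The demo function is purely illustrative: it sends SIGINT to the current process to stand in for the user pressing Ctrl-C.

```haskell
import Control.Concurrent (myThreadId, threadDelay)
import Control.Exception (AsyncException (UserInterrupt), throwTo, try)
import System.Posix.Signals (Handler (Catch), installHandler, raiseSignal, sigINT)

-- Replace GHC's default SIGINT handler with one that mimics it:
installCtrlC :: IO ()
installCtrlC = do
    mainTid <- myThreadId
    _ <- installHandler sigINT (Catch $ throwTo mainTid UserInterrupt) Nothing
    return ()

-- Deliver SIGINT to ourselves and observe UserInterrupt arriving:
demo :: IO (Either AsyncException ())
demo = do
    installCtrlC
    try $ do
        raiseSignal sigINT   -- stands in for the user pressing Ctrl-C
        threadDelay 1000000  -- interrupted long before the second is up
```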
There are reports that if the user hits Ctrl-C twice the runtime will abort the program. In my tests, that seems to be a feature of the shell rather than GHC itself - in the Windows Command Prompt no amount of Ctrl-C stops an errant program, in Cygwin a single Ctrl-C works.

What happens when the main thread receives UserInterrupt?
There are a few options:
- If you are not masked and there is no exception handler, the thread will abort, which causes the whole program to finish. This behaviour is the desirable outcome if the user hits Ctrl-C.
- If you are running inside an exception handler (e.g. catch or try) which is capable of catching UserInterrupt then the UserInterrupt exception will be returned. The program can then take whatever action it wishes, including rethrowing UserInterrupt or exiting the program.
- If you are running with exceptions masked, then the exception will be delayed until you stop being masked. The most common way of running while masked is if the code is the second argument to finally or one of the first two arguments to bracket. Since Ctrl-C will be delayed while the program is masked, you should only do quick things while masked.
The easiest way to "lose" a UserInterrupt is to catch it and not rethrow it. Taking a real example from ghcid, I sometimes want to check if two paths refer to the same file, and to make that check more robust I call canonicalizePath first. This function raises errors in some circumstances (e.g. the directory containing the file does not exist), but is inconsistent about error conditions between OS's, and doesn't document its exceptions, so the safest thing is to write:

canonicalizePathSafe :: FilePath -> IO FilePath
canonicalizePathSafe x = canonicalizePath x `catch`
    \(_ :: SomeException) -> return x
If there is any exception, just return the original path. Unfortunately, the catch will also catch and discard UserInterrupt. If the user hits Ctrl-C while canonicalizePath is running the program won't abort. The problem is that UserInterrupt is not thrown in response to the code inside the catch, so ignoring UserInterrupt is the wrong thing to do.

What is an async exception?
In Haskell there are two distinct ways to throw exceptions, synchronously and asynchronously.
- Synchronous exceptions are raised on the calling thread, using functions such as throw and error. The point at which a synchronous exception is raised is explicit and can be relied upon.
- Asynchronous exceptions are raised by a different thread, using throwTo and a different thread id. The exact point at which the exception occurs can vary.
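As a small illustration of the second bullet, any ordinary exception value can be delivered asynchronously with throwTo. The sketch below (not from the post) delivers UserInterrupt to a worker thread; the ready MVar is only there to make the demonstration deterministic by ensuring the worker is already inside its catch before we throw.

```haskell
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar, threadDelay)
import Control.Exception (AsyncException (..), catch, throwTo)

-- Send UserInterrupt to a worker and observe its catch receiving it.
-- (Catching an async exception like this is exactly the antipattern
-- discussed below; here it is just instrumentation.)
deliverInterrupt :: IO AsyncException
deliverInterrupt = do
    ready <- newEmptyMVar
    done  <- newEmptyMVar
    tid   <- forkIO $
        (putMVar ready () >> threadDelay 1000000 >> putMVar done ThreadKilled)
            `catch` \e -> putMVar done (e :: AsyncException)
    takeMVar ready              -- the worker is now inside the catch
    throwTo tid UserInterrupt   -- asynchronous delivery of a plain value
    takeMVar done
```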
In Haskell, there is a type called AsyncException, containing four exceptions - each special in their own way:
- StackOverflow - the current thread has exceeded its stack limit.
- HeapOverflow - never actually raised.
- ThreadKilled - raised by calling killThread on this thread. Used when a programmer wants to kill a thread.
- UserInterrupt - the one we've been talking about so far, raised on the main thread by the user hitting Ctrl-C.
While these have a type AsyncException, that's only a hint as to their intended purpose. You can throw any exception either synchronously or asynchronously. In our particular case of canonicalizePathSafe, if canonicalizePath causes a StackOverflow, we probably are happy to take the fallback case, but likely the stack was already close to the limit and will occur again soon. If the programmer calls killThread that thread should terminate, but in ghcid we know this thread won't be killed.

How can I avoid catching async exceptions?
There are several ways to avoid catching async exceptions. Firstly, since we expect canonicalizePath to complete quickly, we can just mask all async exceptions:

canonicalizePathSafe x = mask_ $
    canonicalizePath x `catch` \(_ :: SomeException) -> return x
We are now guaranteed that catch will not receive an async exception. Unfortunately, if canonicalizePath takes a long time, we might delay Ctrl-C unnecessarily.
Alternatively, we can catch only non-async exceptions:

canonicalizePathSafe x = catchJust
    (\e -> if async e then Nothing else Just e)
    (canonicalizePath x)
    (\_ -> return x)
    where async e = isJust (fromException e :: Maybe AsyncException)
We use catchJust to only catch exceptions which aren't of type AsyncException, so UserInterrupt will not be caught. Of course, this actually avoids catching exceptions of type AsyncException, which is only related to async exceptions by a partial convention not enforced by the type system.
Finally, we can catch only the relevant exceptions:

canonicalizePathSafe x = canonicalizePath x `catch`
    \(_ :: IOException) -> return x

Unfortunately, I don't know what the relevant exceptions are - on Windows canonicalizePath never seems to throw an exception. However, IOException seems like a reasonable guess.

How to robustly deal with UserInterrupt?
I've shown how to make canonicalizePathSafe not interfere with UserInterrupt, but now I need to audit every piece of code (including library functions I use) that runs on the main thread to ensure it doesn't catch UserInterrupt. That is fragile. A simpler alternative is to push all computation off the main thread:

import Control.Concurrent.Extra

ctrlC :: IO () -> IO ()
ctrlC act = do
    bar <- newBarrier
    forkFinally act $ signalBarrier bar
    either throwIO return =<< waitBarrier bar

main :: IO ()
main = ctrlC $ ... as before ...
We are using the Barrier type from my previous blog post, which is available from the extra package. We create a Barrier, run the main action on a forked thread, then marshal completion/exceptions back to the main thread. Since the main thread has no catch operations and only a few (audited) functions on it, we can be sure that Ctrl-C will quickly abort the program.
Using version 1.1.1 of the extra package we can simplify the code to ctrlC = join . onceFork.

What about cleanup?
Now we've pushed most actions off the main thread, any finally sections are on other threads, and will be skipped if the user hits Ctrl-C. Typically this isn't a problem, as program shutdown automatically cleans all non-persistent resources. As an example, ghcid spawns a copy of ghci, but on shutdown the pipes are closed and the ghci process exits on its own. If we do want robust cleanup of resources such as temporary files we would need to run the cleanup from the main thread, likely using finally.

Should async exceptions be treated differently?
At the moment, Haskell defines many exceptions, any of which can be thrown either synchronously or asynchronously, but then hints that some are probably async exceptions. That's not a very Haskell-like thing to do. Perhaps there should be a catch which ignores exceptions thrown asynchronously? Perhaps the sync and async exceptions should be of different types? It seems unfortunate that functions have to care about async exceptions as much as they do.

Combining mask and StackOverflow
As a curiosity, I tried to combine a function that stack overflows (using -O0) and mask. Specifically:

main = mask_ $ print $ foldl (+) 0 [1..1000000]
I then ran that with +RTS -K1k. That prints out the value computed by the foldl three times (seemingly just a buffering issue), then fails with a StackOverflow exception. If I remove the mask, it just fails with StackOverflow. It seems that by disabling StackOverflow I'm allowed to increase my stack size arbitrarily. Changing print to appendFile causes the file to be created but not written to, so it seems there are oddities about combining these features.

Disclaimer
I'm certainly not an expert on async exceptions, so corrections welcome. All the above assumes compiling with -threaded, but most applies without -threaded.
What are the most significant additions to the language in the last 15 years?

submitted by thecity2
Composite Replicated Data Types
Alexey Gotsman and Hongseok Yang
Modern large-scale distributed systems often rely on eventually consistent replicated stores, which achieve scalability in exchange for providing weak semantic guarantees. To compensate for this weakness, researchers have proposed various abstractions for programming on eventual consistency, such as replicated data types for resolving conflicting updates at different replicas and weak forms of transactions for maintaining relationships among objects. However, the subtle semantics of these abstractions makes using them correctly far from trivial.
To address this challenge, we propose composite replicated data types, which formalise a common way of organising applications on top of eventually consistent stores. Similarly to a class or an abstract data type, a composite data type encapsulates objects of replicated data types and operations used to access them, implemented using transactions. We develop a method for reasoning about programs with composite data types that reflects their modularity: the method allows abstracting away the internals of composite data type implementations when reasoning about their clients. We express the method as a denotational semantics for a programming language with composite data types. We demonstrate the effectiveness of our semantics by applying it to verify subtle data type examples and prove that it is sound and complete with respect to a standard non-compositional semantics.