Xen and Fedora

Submitted by TomMD on Wed, 03/19/2008 - 7:36pm.

So this is a little off topic for, but I'm going to document some of my run-ins with Xen and Fedora here. Also, I am looking for somewhere more generic to move my blog - any suggestions? EDIT: I just decided to go with wordpress, so you can find my ramblings at

Installing Xen
I had some issues with installing/using Xen from source. I mudded the waters at the start by trying out various versions / packagings (Fedora Xen package, xen-3.0.4-testing, xen-3.2-testing, xen-3.2.0) and I discovered that:

  1. 'make clean' will remove ALL xen kernels (rm /boot/*xen*). This eliminated a perfecting fine xen-3.0.4 install when I was screwing around with 3.2-testing.
  2. 'make clean' missed some python scripts - causing the xm tool funky behavior till I manually cleaned /usr/lib and reinstalled Xen.
  3. Fedoras Xen package is fine (currently a Xen-3.1 derivative) , unless you have special desires like ACM - in which case you certainly need to use a source distribution. Good work Fedora!

Installing Fedora PVMs
Using the virt-install tool (package: python-virtinst) it is dead simple to install stable Fedora distros (of coarse the GUI tool, virt-manager, exists too):

# virt-install -n F8 -r 512 --vcpus=1 -f /var/lib/xen/images/F8.img \
--vnc -p -b virbr0 -l ""

But be sure to enable packet forwarding by adding to your /etc/sysctl.conf. If you don't, you'll find the packets won't get forwarded from your bridge (xenbr0 or virbr0) to your network.

# Controls IP packet forwarding
net.ipv4.ip_forward = 1

Or use sysctl or directly alter /proc/sys/net/ipv4/ip_forwarding, if you are rustic like me.

Dealing with Rawhide
I figured I could be a rawhide tester by running it as a DomU - I was right and wrong.

  1. Rawhide can not act as Dom0. It seem that, due to the PV_Ops task, Fedora 9 is not intended as Dom0 - Fedora 8 and 10 will fill the host role.
  2. Rawhide doesn't currently run as a paravirtual machine (PVM - a type of DomU). I've reported bug 437217, which has an obvious fix, and bug 437706, which is where I'm at now.

Refuting Myths about Polymorphism

Submitted by mrd on Sun, 03/02/2008 - 11:23pm.

I am addressing an article written by Mark Dominus named Subtypes and polymorphism:

References and Type Safety

In the second part of the article, Mark points out an example which he says demonstrates where Standard ML breaks type safety. In fact, he is passing along a common myth about the "value restriction" in Standard ML and other Hindley-Milner based languages. To state it quite simply up-front: the "value restriction" is not necessary to protect type safety. It is simply there to preserve an intuition about how code evaluates.

This is the example he presents. If you give this to a Standard ML compiler, it will fail on the line beginning with val a because of the "value restriction."

fun id x = x (* id : α → α *) val a = ref id; (* a : ref (α → α) *) fun not true = false | not false = true ; (* not: bool → bool *) a := not; (!a) 13

If the value restriction were to be lifted, the code would still be type-safe. Mark's interpretation of the code is incorrect. But the thinking process which led him to his incorrect interpretation is itself the reason for the value restriction.

The Value Restriction

The restriction itself is simple to describe: if you have code that looks like this:

val x = ...

and the ... has a type involving polymorphism, then it must be a value syntactically.

ref id of type (α → α) ref is not a value -- it will evaluate to a location-value. Also, it is polymorphic. Therefore the value restriction applies, and the code is not permitted.

The value restriction is seemingly helping us preserve type safety. But that is a misconception. If you removed the value restriction, you could compile the above code, and it would still be type-safe.

The short answer to what would happen is this: the a on the line a:=not and other one on the line (!a) 13 are not evaluating to same location-value! Therefore, you are not, in fact, applying not to an integer. You are applying id to an integer, which is perfectly fine.

Under the hood

To explain why this is the case, I need to show you that the code above is omitting important details about polymorphism. Those details are automatically reconstructed and inserted by the typechecker. Here is a version of the code that is decorated with the details after typechecking:

val id : ∀α. α → α = Λα. fn (x:α) => x val a : ∀α. (α → α) ref = Λα. (ref{α → α}) (id{α}) val not : bool → bool a{bool} := not; ((!{int → int}) (a{int})) 13

Whew! Now you can see why people prefer to let the typechecker deal with this stuff. But now the Hindley-Milner style types have been translated into a System F (aka polymorphic lambda-calculus) style language.

When you see the type α → α in H-M types, it's omitting the universal quantifier. Haskell folks are already familiar with it, because a popular extension allows the explicit use of ∀ (aka "forall"). All the type variables must be quantified in a ∀ placed at the beginning of the type.

The term which has a type beginning with ∀ is called Λ (aka "type-lambda"). Therefore, this term must be introduced everywhere a ∀ appears in the type. The typechecker rewrites your code to insert the necessary type-lambdas.

In order to use a polymorphic value (a term wrapped in a type-lambda), you must apply a type to the type-lambda. This is denoted t{T} in the above example. For example, to use the polymorphic function id : ∀a. α → α, you must first apply its type-lambda to the type that you want: (id{int}) 13 steps to 13.

Now, I can explain what is happening behind the scenes when you use a. Since a is polymorphic, that means it is wrapped in a type-lambda. In order to be used as a reference, it must be applied to a type. For example:

a{bool} steps to (ref{bool → bool}) (id{bool}) which then steps to l1

where l1 of type (bool → bool) ref is a location-value –- the result of evaluating a call to ref.

On the other hand, later on we do:

a{int} steps to (ref{int → int}) (id{int}) which then steps to l2

where l2 is a different location-value. We've called ref a second time, and it's given us a different location.

The Unexpected

When Mark wrote val a = ref id he clearly did not expect ref to be called more than once. But I have demonstrated that is exactly what would happen if his code were to be compiled and run.

This kind of behavior is type-safe; we never attempt to invoke not on the integer 13 because the function not is safely stored in an entirely different location. It is possible to go ahead and prove the type safety of a language like this with references. But clearly, the repeated computation of ref is surprising, and might even be said to be signs of a "leaky abstraction" in the Hindley-Milner type system. It attempts to hide the details about type-lambdas, but they've revealed themselves in the dynamic behavior of the code.

The "value restriction" then, is simply a measure put in place to prevent "surprising" behavior from occurring in the first place. By allowing only values syntactically, you will not end up with any evaluation after applying the type-lambda, and all is as expected.

Subtyping and Polymorphism

Submitted by mrd on Sun, 03/02/2008 - 9:54pm.

I am addressing an article written by Mark Dominus named Subtypes and polymorphism:

There are two parts to this article, the first discusses the fact that subtyping and polymorphism can lead to some unintuitive results. I won't repeat his point here, but just note that the term for the behavior of "List" here is called "invariant" and the incorrect behavior which he had assumed is called "covariant."

Java's broken arrays

While Java generic Lists are "invariant," Java arrays are "covariant." This is not due to some wonderful property of arrays, but in fact is a gigantic flaw in the Java language which breaks static type safety. As a result, arrays must carry run-time type information in order to dynamically check the type of objects being inserted. Consider this code:

String stringA[] = new String[1]; stringA[0] = "Hello"; Object objectA[] = stringA;

So far this seems okay -- String is a subtype of Object -- therefore it seems safe to say that String arrays are a subtype of Object arrays. This is called "covariant" behavior.

So if objectA is an array of Objects, and everything is a subtype of Object, then it should be okay to put an Integer there instead, right?

objectA[0] = 1;

but wait, now stringA[0] has an Integer in it! A String array cannot have an Integer in it. Before it seemed that arrays might be "covariant," but mutation makes it seem that arrays must be "contravariant." We can't be both at the same time, so it seems mutable arrays are "invariant" after all.

If you try running the code given above, you get:

Exception in thread "main" java.lang.ArrayStoreException: java.lang.Integer

clearly, a run-time type-check is being performed.

About Covariance, Contravariance, and Invariance

I've been using these terms, but haven't really defined them.

A type-constructor C is covariant iff: for all types S,T, if S is a subtype of T then C<S> is a subtype of C<T>.

Similarly, for contravariance, if S is a subtype of T then C<T> is a subtype of C<S>. Note that S and T have swapped.

Invariance (the name is confusing, I know) is the lack of covariance or contravariance.

It may be of interest to know that functions are contravariant in the input, but covariant in the output. Perhaps that is why the Java folks are not so eager to have function types!

Broken arrays and generics

The brokenness of Java arrays has impacted on the design of Java generics as well. Consider the following code that might be written if you were designing your own generic container of some sort:

public class Container<T> { private T[] a;

So far so good, right? An array of some parameterized type seems reasonable.

public Container() { a = (T[])new T[10]; } }

Uh oh! generic array creation a = (T[])new T[10]; ^ Note: uses unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. 1 error

No can do. Java is jumping down my throat over this seemingly reasonable code. Why? Generic arrays are not permitted like this, because -- as described above -- Java arrays must carry run-time type information. But Generics promise total type erasure -- meaning NO run-time type information is kept. Sounds like they designed themselves into a corner when creating Java, and later, Generics.

Polymorphism is not Subtyping

Returning to the second part of the article, I want to dispel a few common misperceptions about polymorphism.

Mark Dominus: Then we define a logical negation function, not, which has type bool → bool, that is, it takes a boolean argument and returns a boolean result. Since this is a subtype of α → α, we can store this function in the cell referenced by a.

I want to make it absolutely clear that bool → bool is not a subtype of α → α. There is no notion of subtyping in basic Hindley-Milner, and it is still not the case even in a "loose" sense.

The easiest way to verify this for yourself is to find your nearest Standard ML (or similar) prompt and type this:

not : 'a -> 'a

and get a type error. not : bool → bool and it is not of type α → α.

Now, his example does not typecheck in Standard ML because of the "value restriction." The value restriction does not exist to preserve type safety at all, but only to prevent programmer confusion.

This post has gone on too long about Java, so I am afraid it will wash out the more important points about polymorphism and the value restriction. I will leave it then, to the next post.

Do arrows, monads, monad transformers clash when a large app needs libraries which use each of these?

Submitted by metaperl on Mon, 02/25/2008 - 12:59pm.

Now for my question

from my neophyte perspective, Clean has a single easy-to-use mechanism for state maintenance - uniqueness typing.

Haskell has monads.

But then it also has arrows now.

And so there are two XML products, one based on arrows and one based monads.

When one is developing a large project in Haskell, what does one do if certain libraries are done in arrows and others in monads and you need to use both? And what about monad-transformers? Is that not yet a 3rd complication?

A Java programmer only has objects to handle any concept. So the idea of one library doing one thing and another doing another does not come up.

So, is there a difficulty in large application development because of the clashing advanced functional metaphors of arrow, monad, monad transformer?

Let's face it. There are very few large-scale products shipped in Haskell. Period. I dont know how good that is. or why that is.

I suppose Galois is happy with haskell, but the lack of large useable apps make you wonder if something like what I mention is getting in the way.

About my level and interest haskell

I have been through SJT's book. Been through the Haskell Road. Been through Sethi's "Algorithms". Like Bird and Wadler's intro to FP.

Now, I first got interested in Haskell in 2003. I love haskell. It is profoundly elegant. I never quite got over the hump into making it useful... 8 years as a Perl developer spoils you. But as a Perl developer, I think very functionally and studying Haskell has helped me immensely. But i always run back to my crutches when the game gets tight.

An Exersize in Over-Engineering

Submitted by TomMD on Wed, 02/20/2008 - 9:44pm.


Inspired by an old article on over engineering I decided to try and over design / bloat an event system (aka a timeout system). This is the same system I posted about earlier, but with fewer features and many more LOC.


The initial concept was clean, simple. It would handle hundreds of events, be useful for 80% of the cases (I do like the 80% rule). My event system state was this:

data EventState = EvtState { esEvents :: [Event], esAlarm :: ClockTime, esNextId :: EventId }

Each new event gets a new Id number, the 'NextId' is incremented, and the event is added to the time ordered list, 'esEvents'. 'esAlarm' is updated as necessary (atomically by the thread adding the event with an earlier expire time).

Functionally, there are 2 threads in the system and 'n' threads of event producers. One thread, eventExpire, would threadDelay until the soonest event needs to be expired. The other thread, watchAlarm, would keep an eye on esAlarm - if it was changed to an earlier time then an exception would be thrown to the eventExpire thread.

Theoretical Bugs!

Oh no! You could theoretically overflow the Int used to express EventId. I could use Integer, but that still has a theoretical limit and the memory used for ids would grow (eventually). So, I set off to redefine EventId from an Int to a (ClockTime,Int) combo. How do I know this doesn't overflow? Because my new state field which keeps track of the next event Id for each ClockTime (less the expired times), and if it does overflow for a given clock time I refuse to add the event.

Realistic Performance Issues

That's right! A close runner up for theoretical bugs is fixing performance. Why is this slow for large numbers of events? Well, adding events is O(n), expiring events can are in order in the list, and updating the alarm time is O(1).

Lets turn _everything_ into a mapping. This way most operations of interest are ~O(ln(n)) The new state will look like this:

type EventSet = (EventNumber, Map EventNumber (IO ())) data EventSystem = EvtSys { esEvents :: TVar (Map ClockTime EventSet), -- Pending Events esThread :: TVar (Maybe ThreadId), -- Thread for TimerReset exceptions esAlarm :: TVar ClockTime, -- Time of soonest event esNewAlarm :: TVar Bool, -- new/earlier alarm }

What is this in English? esEvents is a mapping of ClockTimes to a mapping of EventNumbers to IO Actions (and the next unique EventNumber is in there). Also, there is now esNewAlarm, so the alarmWatch can avoid throwing expires unnecessarily.

All in all this does OK, 3 seconds to add and execute 100,000 actions (vs a previous 38 seconds for 10,000).

Fixing Dropped Events

Now we have eliminated theoretical bugs and improved performance - whats left? Fixing real bugs of coarse! When running the test code (shamelessly stolen from control-timeout - thanks Adam!) I found some tests never finished and no more events were pending. The cause seems to be that between the event list being pruned of old events and those events being executed the thread was getting an exception (a new event was added with an earlier alarm). The solution is yet another thread and TVar (that's right, I've an irrational dislike of 'block'). Old events are atomically pruned and written to an 'expired' list - then the third thread, monitorExpiredQueue, can zero the queue and execute the actions. My new state has the one more TVar:

esExpired :: TVar [[EventSet]]

A Useful API

So now that I've finished all this, and we are two or three days after my original post, I need to make the API something useful. Current code looks like:

sys <- initEventSystem (TOD sec ps) <- getClockTime eid <- addEvent sys (TOD (sec+delay) ps) evt ... if dontNukeThem then cancelEvent sys eid else return () -- explosives concept borrowed from recent -cafe example

Adam had a better idea for most use cases in that they work based on deltas. So I wrote a shim to provide the Control.Timeout API with the Control.Event engine (Control.Event.Timeout).

eid <- addTimeout act delay if dontNukeThem then cancelTimeout eid else return ()
And an informal performance comparison is:
   Control.Event can add 100K events in 3  seconds
   Control.Timeout       100K           12 seconds
   Control.Event.Timeout 100K           58 seconds

Control.Event.Timeout calls getClockTime on every event add, so driving the Control.Event code with time deltas is really slow. Control.Timeout is obviously very good with time deltas.

Control.Event Released

Submitted by TomMD on Sat, 02/16/2008 - 11:22am.

I've released control-event (on, which provides a system to schedule and cancel IO events for given System.Time.ClockTimes. While designed prior to the release of Control.Timeout, it is very similar, the main differences are:

  1. Control.Event requires initilization while Control.Timeout hides this via unsafePerformIO.
  2. Control.Event can add events in the STM monad using addEventSTM. While Control.Timeout has an addEventAtomic, they aren't drop in replacements for each other (addEventAtomic is an IO action that returns an STM operation).
  3. Control.Timeout externally interfaces by time deltas (Float) while Control.Event is based on absolute ClockTimes.
  4. Control.Timeout uses only one Haskell thread while Control.Event needs two.
  5. Control.Event will not duplicate EventIds once the counter overflows. The possibility of this happening in Control.Timeout is entirely theoretical - I'm not claiming there is a practical difference.

Future Changes:

  • I'll either expand or eliminate the extra features of Event Preprocessing.
  • If no one uses it, I'll probably be getting rid of the ability to run expired actions in the event system thread.
  • I'm guessing Data.Time works on Windows. So long as I don't learn otherwise, I'll move from the old-time package to the 'time' package.
  • I'll add a shim to allow adding events using time deltas.

EDIT: I should add that Control.Timeout performs MUCH better.

Let me know if you find this useful!

Converting a ByteString to a Double using FFI

Submitted by mrd on Tue, 12/18/2007 - 3:36pm.

ByteString, the ever-useful fast string library, has some nice functions for reading Ints which make reading in test data a breeze. However, there's no similar function for Doubles which has caused some recent troubles for me when dealing with floating point data. The C library has a function for this task: strtod. So I decided to take some older but similar FFI code and mold it into an interface to strtod.

{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign
import Foreign.C.Types
import Data.ByteString.Base

foreign import ccall unsafe "stdlib.h strtod"
c_strtod :: Ptr Word8 -> Ptr (Ptr Word8) -> IO CDouble

strtod :: ByteString -> Maybe (CDouble, ByteString)
strtod str = inlinePerformIO .
withForeignPtr x $ \ p ->
alloca (\ p_endptr -> do
let p_s = p `plusPtr` s
n <- c_strtod p_s p_endptr
endptr <- peek p_endptr
let diff = endptr `minusPtr` p_s
if diff == 0
then return Nothing
else return $ Just (n, PS x (s + diff) (l - diff)))
where (x, s, l) = toForeignPtr str

The type is chosen in particular to be convenient for usage with unfoldr.

The code probably looks more complicated than it is. Since ByteStrings are really stored as ForeignPtr Word8 (along with start-offset and length), it is easy to grab the raw form for usage by C functions. withForeignPtr lets you manipulate the raw Ptr Word8 within a function. Pointer arithmetic is performed using functions like plusPtr. The second parameter to strtod is actually an "out" parameter which sets a pointer to the spot it finished reading. I allocate space to store a pointer, namely a Ptr (Ptr Word8) and Haskell can figure out how much space it needs from the inferred type. I peek to retrieve the "returned" end-pointer. Then it's just a matter of putting back together the ByteString (PS for "Packed String") with the new offset and length.

QWERTY makes a comeback on Dvorak

Submitted by metaperl on Fri, 12/14/2007 - 7:17am.

If you take a look at handheld computers and/or PDAs, GPSs, etc. you will note that the keys are quite close together.

This is where QWERTY is a good thing. It was a typing system designed to keep subsequent keystrokes from being very close to each other.

This is great for these tiny keyboards... with dvorak, I imagine your fingers would be bumping into each other.

Data Parallel Bellman-Ford

Submitted by mrd on Thu, 11/29/2007 - 7:48pm.

Graph representation is as an edge-list.

bmf :: Int -> UArr (Int :*: Int :*: Double) -> Int -> UArr Double
bmf n es src = iterate rnd dm0 !! (n - 1)
dm0 = toU [ if i == src then 0 else inf | i <- [0 .. n - 1] ]
rnd dm =
updateU dm
(mapU (\ e ->
if distOf dm (destin e) > distOf dm (source e) + weight e
then (destin e) :*: (distOf dm (source e) + weight e)
else (destin e) :*: (distOf dm (destin e)))

source = fstS . fstS
destin = sndS . fstS
weight = sndS
distOf dm u = dm !: u

inf :: Double
inf = 1 / 0

The above code uses NDP but only the sequential portions. In order to get parallelism I need to invoke the Distributed operations. However, there are also Unlifted.Parallel operations which hide the usage of the distributed ops.

bmf :: Int -> UArr (Int :*: Int :*: Double) -> Int -> UArr Double
bmf n es src = iterate (rnd es) dm0 !! (n - 1)
dm0 = toU [ if i == src then 0 else inf | i <- [0 .. n - 1] ]

{-# INLINE rnd #-}
rnd :: UArr (Int :*: Int :*: Double) -> UArr Double -> UArr Double
rnd es dm = updateU dm
. mapUP sndS
. filterUP fstS
. mapUP (\ e ->
let d = distOf dm (destin e)
d' = distOf dm (source e) + weight e
in if d > d'
then True :*: (destin e :*: d')
else False :*: (0 :*: 0))
$ es

mapUP is the unlifted parallelized version of mapU. However there's no updateUP. Looking into the code I spotted out a commented out version of updateUP. There are problems when figuring out what to do about multiple concurrent writes to one location. NESL specifies "arbitrary" resolution of conflicting concurrent writes. Unfortunately the commented code has bit-rotted and I haven't successfully managed to fix it.

I also added some filtering to prevent it from getting stuck writing Infinity over a real distance constantly, due to "arbitrary" resolution of conflicting writes in updateU. The iterative function is now factored out into its own toplevel definition, for clarity.

Improving Haskell MD5

Submitted by TomMD on Fri, 11/09/2007 - 7:42pm.

I cleaned up the code!

  • Fixed the ugly 'blocks' function. This results in better performance and fixed the space leak!
  • Changed the license from GPL2 to BSD3.
  • Eliminating the 'finalize' doesn't give any further performance, so I think this was a really odd bug in my code, not GHC.

    Again, the code is at:
    darcs get

    And here is the quick and dirty performance of the unrolled version:

    $ time ./hsMD5 ../2GB

    real 1m22.685s
    user 0m31.730s
    sys 0m4.676s

    $ time md5sum ../2GB
    b8f1f1faa6dda5426abffb3a7811c1fb ../2GB

    real 1m21.343s
    user 0m25.330s
    sys 0m4.528s

    So, at least on my clunker, the Haskell code shows good time on the 2GB test.

    $ time ./hsMD5 ../200MB

    real 0m2.267s
    user 0m2.148s
    sys 0m0.116s

    $ time md5sum ../200MB
    eda9a9889837ac4bc81d6387d92c1bec ../200MB

    real 0m0.775s
    user 0m0.704s
    sys 0m0.068s

    And its just under 3 time slower than C on that one.