Hello ladies and gents,
I am completely new to programming in general and I decided to take a crack at Haskell for my first language. I only got through the first part of "Learn You a Haskell for Great Good!" and I like how intuitive this language is. With that being said, I can see that learning Haskell or any language for the matter takes up a lot of time. No, I'm not going to ask if Haskell is better than C++ or Python. I understand that each language has benefits and drawbacks. Instead, I want to know if it can do what I need more efficiently than other languages.
So here is something I would be interested in: Get a list of all of the equities currently trading in the US. From there find the past 5 years of trailing twelve months ratios like: P/BV, EV/EBITDA, P/EBIT, P/CFO, and P/FCF. Visualize the distributions of the ratios at any given point in time. Pick any security on the list and get a percentile ranking for ratios at any given point in time. For example: Ford is at the 70% cumulative percentile for P/BV at Jan 2014 and 40% cumulative P/CFO at Jan 2014. Visualizing these ratio distributions will give me a better way to compare securities and see how the general market is acting.
The situation described above is just one of the many ways I want to use a programming language for financial applications. Since my winter break is coming up soon I want to devote most of it towards getting good at programming. I'm not afraid of a steep learning curve. I am also not looking to get hired in finance based on my programming abilities (there are none as of now). Instead, I just want to experiment with whatever language I learn to help me with finance or anything else really. So, is Haskell right for me? Is there another language that will help me do what I want more efficiently?
About me: Things I know a little bit about that I will use a language to answer questions in: econometrics, statistics, accounting, financesubmitted by dandraper15
[link] [59 comments]
Ever since ICFP 2014 I’ve had as a goal to get into the habit of coding in Haskell. It’s been the language I enjoy most for a few years now, but being surrounded by and talking to so many brilliant developers as I did during that week really drove home that I will only have more fun the more code I write. My goal was not very ambitious; just write something in Haskell most days every week. So far I’ve managed to keep it up.
These are a few tricks I’ve used and they’ve worked well for me so far.Just write, no matter what, just write
In ninth grade a rather successful Swedish author visited my school and what I remember most from that is one thing he said:
Just read! It doesn’t matter what. It doesn’t matter if what you read isn’t considered good literature; read Harlequin books if that’s what you like, read magazines, read comics. Just read!
I think the same holds for writing code; it’s only with practice that one gets comfortable expressing oneself in a particular language.Fix warts
I can’t actually think of any piece of code I’ve written that doesn’t have some warts. It may be in the form of missing features, or quirks (bugs) in the implementation that forces the user to regularly work in a less-than-optimal way. I’ve found fixing warts in tools and libraries I use myself to be one of the most rewarding tasks to take on; the feedback is so immediate that every fix cause a boost in motivation to fix the next one.Exercise sites
Sometimes it’s simply difficult to find the motivation to tackle working on an existing project, and inspiration for starting something new might be lacking too. This happens to me regularly, and I used to simply close the lid on the computer earlier, but now I try to find some exercises to do instead.
There are several sources of exercises. I know Project Euler is rather popular among new Haskellers, but there are others.
- CodeEval is a site with problems in three different levels. It may be extra interesting for people in the US since some of the problems are sponsored by companies which seem to use the site as a place for recruiting. So far I’ve only seen American companies do that, but I suppose it might catch on in other parts of the world too. Haskell is one of several languages supported.
- Exercism is both a site and a tool. The goal is to facilitate learning of languages. On first use the tool will download the first exercise, and after completion one uses it to upload the solution to the site. Once uploaded the solution is visible to other users, and they are allowed to “nitpick” (comment). After uploading a solution to one exercise the next exercise in the series becomes available. It supports a rather long list programming languages.
I like both of these, but I’ve spent more time on the latter one. Personally I find the idea behind Exercism very appealing and I’ve been recommending it to a couple of co-workers already.
Feel free to put links to other sources of exercises in the comments.Simplify old code
With more practice comes more and more insights into what functions are available and how to string them together. When I don’t even feel like doing a full exercise on Exercism I just dig out something that smells a little and clean it up. Anything is fair game, no matter how tiny. Just take a look at my implementation of reThrowError.What else?
I’d love to hear tips and tricks from other people who aren’t lucky enough to have a day job where they get to write Haskell. How do you keep up the learning and practice?
My code is:data BTree a = Bleaf a | Bnode a (BTree a) (BTree a) deriving(Show) class IsTree b => IsMarkedTree b where mark :: BTree b -> b instance IsMarkedTree (BTree b) where mark (Bnode a _ _) = a mark (Bleaf a) = a
The error is:no instance for (IsMarkedTree Integer) arising from a use of mark ...
Could someone explain how I can fix it?
Thanks in advance!submitted by echo-the-crat
[link] [17 comments]
While waiting for LambdaCube to reach a stable state I started thinking about displaying text with it. I wanted to build a system that doesn’t limit the user to a pre-determined set of characters, so all the static atlas based methods were out of the question. From experience I also know that a dynamic atlas can fill very quickly as the same characters need to be rendered to it at all the different sizes we need to display them in. But I also wanted to see nice sharp edges, so simply zooming the texture was out of the question.What’s out there?
Short of converting the characters to triangle meshes and rendering them directly, the most popular solution of this problem is to represent them as signed distance fields (SDF). The most commonly cited source for this approach is a well-known white paper from Valve. Unfortunately, the problem with SDF based rendering is that it generally cannot preserve the sharpness of corners beyond the resolution of the texture:
The above letter was rendered at 64 pixels per em size. Displaying SDF shapes is very simple: just sample the texture with bilinear interpolation and apply a step function. By choosing the right step function adapted to the scale, it’s possible to get cheap anti-aliasing. Even simple alpha testing can do the trick if that’s not needed, in which case rendering can be done without shaders. As an added bonus, by choosing different step functions we can render outlines, cheap blurs, and various other effects. In practice, the distance field does an excellent job when it comes to curves. When zoomed in, the piecewise bilinear nature of the curves becomes apparent – it actually tends to look piecewise linear –, but at that point we could just swap in the real geometry if the application needs higher fidelity.
Of course, all of this is irrelevant if we cannot afford to lose the sharp corners. There are some existing approaches to solve this problem, but only two come to mind as serious attempts: Loop and Blinn’s method (low-poly mesh with extra information on the curved parts) and GLyphy (SDF represented as a combination of arc spline approximations). Both of these solutions are a bit too heavy for my purposes, and they also happen to be patented, which makes them not so desirable to integrate in one’s own system.
The Valve paper also points towards a possible solution: corners can be reconstructed from a two-channel distance field, one for each incident edge. They claimed not to pursue that direction because they ‘like the rounded style of this text’ (yeah, right!). Interestingly, I haven’t been able to find any follow-up on this remark, so I set out to create a solution along these lines. The end result is part of the LambdaCube Font Engine, or Lafonten for short.Doubling the channels
Valve went for the brute-force option when generating the distance fields: take a high-resolution binary image of the shape and for each pixel measure the distance to the nearest pixel of the opposite colour. I couldn’t use this method as a starting point, because in order to properly reconstruct the shapes of letters I really needed to work with the geometry. Thankfully, there’s a handy library for processing TrueType fonts called FontyFruity that can extract the Bézier control points for the character outlines (note: at the time of this writing, the version required by Lafonten is not available on Hackage yet).
The obvious first step is to take the outline of the character, and slice it up along the corners that need to be kept sharp. These curve sections need to be extruded separately and they need to alternate between the two channels. If there’s an odd number of sections, one segment – preferably the longest one to minimise artifacts – can gradually transition from one channel to the other:
The extrusion is performed in both directions, so the real outline of the character is the median line of the extruded sections. Afterwards, the internal part needs to be filled somehow. It turns out that simply using the maximum value in both channels yields good results:
The shape used for filling is created from the inner edges of the extruded curve sections. The convex corners are derived as intersections between the consecutive curves, while the concave corners need to be covered more aggressively to make sure there are no dark holes within the contour (not to mention that those curves don’t necessarily intersect if the angle is sharp enough).
Another transformation that turned out to be useful to avoid some undesired interaction between unrelated curves is an extra cutting step applied at obtuse convex angles. For instance, the bottom left part of the 3 above actually looks like this:
It might be the case that this cutting step is redundant, as it was originally introduced while I was experimenting with a different, more brittle channel selection scheme.
Unfortunately, this two-channel distance field doesn’t contain enough information to reconstruct the original shape. The problem is that depending on whether a corner is convex or concave we need to combine the fields with a different function to get an approximation. Convex corners are defined as the minimum of the two values, while concave ones are the maximum. When we render the above image on a low-res texture and sample it linearly, we cannot really determine which case we’re looking at without additional data.Quadrupling the channels
After some experimentation, I came to the conclusion that I needed two additional channels to serve as masks. The final formula to calculate the approximate distance is the following:
max(max(d1 * m1, d2 * m2), min(d1, d2)).
The values of d1 and d2 come from the channels shown above, while m1 and m2 depend on what colour section we are in. If we’re in a section approximated by d1, then m1 = 1 and m2 = 0. For d2, it’s the opposite. The values of m1 and m2 can be obtained by thresholding the two extra channels.
The new channels can be thought of as weights, and they explicitly mark which distance channel is relevant in a given position. This is what they look like for the same character:
The trickiest part is around the corners: we need to make sure that the weights kick in before the distance from the other channel disappears, but not too soon, otherwise the current channel would interfere with the other side of the corner. There’s a lot of hand tuning and magic constants involved, because I couldn’t find the time yet to derive a proper formula in a principled manner, but it handles most situations well enough already. It’s also obvious that the weights are erroneously low in the concave corners, but somehow this bug never seems to lead to visible errors so I haven’t fixed it yet.
One way to visualise the interaction of the four channels is thresholding them and adding the corresponding weights and distances:
The left hand side shows the result of thresholding and adding the channels. The right hand side is the same image, but the green channel is filled with the result of evaluating the reconstruction formula. This visualisation also shows how delicate the balance is around convex curvatures when generating the distance fields with this approach. Fortunately it does the trick!One last step towards happiness
Unfortunately, this is not the end of the story. When the above geometry is rendered at a small resolution, we get some new artifacts partly from rasterisation, partly from linear interpolation in certain cases. The first problem is caused by the fact that the corner vertices of the fill geometry are not shared with the outlines, just derived with a formula, so there’s no guarantee that the rasteriser will fill all the pixels. The second issue can happen when inner edges of the opposite channels touch. When this happens, interpolation is performed between opposite values that both represent inner areas, but their average is so low that it’s interpreted as an outer area:
As it turns out, both issues can be resolved in a single post-processing pass that detects these specific situations and patches up the texture accordingly. All-maximum pixels are spread to adjacent all-minimum pixels, and directly facing opposite channels also unified with the max function.The bottom line
So was the final outcome worth the trouble? I believe so. The resolution necessary to correctly represent different characters highly depends on the font. Fonts with small details (e.g. small serifs) obviously require a finer resolution to reproduce with high fidelity. For instance, Times New Roman needs about 128 pixels per em, which is probably too much for most purposes. On the other hand, Droid Sans is fine with just 64 pixels per em, and Droid Serif with 72. As an extreme example, Ubuntu Regular renders nicely from just 40 pixels per em. In any case, the improvement over the simple distance field of the same resolution is quite spectacular:
In the end, I achieved the original goal. The characters are added to the atlas dynamically on demand, and they can be rendered at arbitrary sizes without getting blurry or blocky in appearance. At the moment the baking step is not very optimised, so it’s not possible to generate many characters without introducing frame drops. But it’s fast enough to be used on a loading screen, for instance.What next?
I’m surprised how well this method works in practice despite the ad hoc approach I took in generating the distance field. However, I’m missing the robustness of the simple distance field, which doesn’t break down in surprising ways no matter what the input shape is. One alternative I’d like to explore involves a very different way of filling the channels. Instead of rendering the geometry directly as described above, I’m considering the use of diffusion curves or generalised Voronoi diagrams. Since the resolution needed to reproduce individual characters is not terribly high, it could be okay to do the baking on the CPU with a dumb brute-force algorithm at first. This would also make it possible to generate characters in a background thread, which would be ideal for games.
Subtyping is one of those concepts that seems to makes sense when you first learn it (“Sure, convertibles are a subtype of vehicles, because all convertibles are vehicles but not all vehicles are convertibles”) but can quickly become confusing when function types are thrown into the mix. For example, if a is a subtype of b, is (a -> r) -> r a subtype of (b -> r) -> r? (If you know the answer to this question, this blog post is not for you!) When we asked our students this question, invariably some were lead astray. True, you can mechanically work it out using the rules, but what’s the intuition?
Maybe this example will help. Let a be tomatoes, and b be vegetables. a is a subtype of b if we can use an a in any context where we were expecting a b: since tomatoes are (culinary) vegetables, tomatoes are a subtype of vegetables.
What about a -> r? Let r be soup: then we can think of Tomato -> Soup as recipes for tomato soup (taking tomatoes and turning them into soup) and Vegetable -> Soup as recipes for vegetable soup (taking vegetables—any kind of vegetable—and turning them into soup). As a simplifying assumption, let's assume all we care about the result is that it’s soup, and not what type of soup it is.
What is the subtype relationship between these two types of recipes? A vegetable soup recipe is more flexible: you can use it as a recipe to make soup from tomatoes, since tomatoes are just vegetables. But you can’t use a tomato soup recipe on an eggplant. Thus, vegetable soup recipes are a subtype of tomato soup recipes.
This brings us to the final type: (a -> r) -> r. What is (Vegetable -> Soup) -> Soup? Well, imagine the following situation...
One night, Bob calls you up on the phone. He says, “Hey, I’ve got some vegetables left in the fridge, and I know your Dad was a genius when it came to inventing recipes. Do you know if he had a good soup recipe?”
“I don’t know...” you say slowly, “What kind of vegetables?”
“Oh, it’s just vegetables. Look, I’ll pay you back with some soup, just come over with the recipe!” You hear a click on the receiver.
You pore over your Dad’s cookbook and find a tomato soup recipe. Argh! You can’t bring this recipe, because Bob might not actually have tomatoes. As if on cue, the phone rings again. Alice is on the line: “The beef casserole recipe was lovely; I’ve got some tomatoes and was thinking of making some soup with them, do you have a recipe for that too?” Apparently, this happens to you a lot.
“In fact I do!” you turn back to your cookbook, but to your astonishment, you can’t find your tomato soup recipe any more. But you do find a vegetable soup recipe. “Will a vegetable soup recipe work?”
“Sure—I’m not a botanist: to me, tomatoes are vegetables too. Thanks a lot!”
You feel relieved too, because you now have a recipe for Bob as well.
Bob is a person who takes vegetable soup recipes and turns them into soup: he’s (Vegetable -> Soup) -> Soup. Alice, on the other hand, is a person who takes tomato soup recipes and turns them into soup: she’s (Tomato -> Soup) -> Soup. You could give Alice either a tomato soup recipe or a vegetable soup recipe, since you knew she had tomatoes, but Bob’s vague description of the ingredients he had on hand meant you could only bring a recipe that worked on all vegetables. Callers like Alice are easier to accommodate: (Tomato -> Soup) -> Soup is a subtype of (Vegetable -> Soup) -> Soup.
In practice, it is probably faster to formally reason out the subtyping relationship than it is to intuit it out; however, hopefully this scenario has painted a picture of why the rules look the way they do.