[dilinger] trave11er: CosmicRay's weblog hasn't led me astray yet :P
[trave11er] dilinger: i cannot trust people who can write a 100+ line haskell program :-)
[trave11er] that's almost a legal definition of insanity
[Igloo] You're right; Haskell is such a concise, elegant language that anyone who stretches a program beyond 100+ lines is clearly doing something wrong
HTML::Mason (www.masonhq.com) is a Perl module that would make most Haskellers cringe. It is the most confused domain-specific-language you can see: Perl embedded in HTML.
But Mason is in wide use at amazon.com. In addition I just interviewed for a Perl poisition at the largest Oracle database installation in the world. And they are planning to move to Mason for HTML/XML/SOAP delivery... why would they do this? Because Mason can stream HTML data: instead of building up an entire XML document and then sending it, it can stream parts. When the document is _huge_ this becomes crucially important: render some of the xml, ship it off, reclaim the memory and keep on truckin'...
How could a lazy functional language achieve the design goal eagerly streaming data as opposed to lazily producing it on demand?
<CosmicRay> I've also never quite understood the exact mechanics of memory utilization and performance of recursive functions in a lazy language like haskell
<CosmicRay> it seems quite different than in most other fp languages
<shapr> xerox: That would be fun to write.
<CosmicRay> and it seems like some of the rules of thumb ("tail recursion is good", for instance) don't apply
<xerox> shapr, go for it! (as you say ;))
<dustin`> I will write something nifty once I've finished "An Introduction to Higher-Order Categorical Logic". There seems to be a lot of "viewing concepts from category theory through Haskell", but not much "viewing Haskell through category theory"
<JaffaCake> CosmicRay: tail recursion is still good, but you have to watch out for laziness too
<shapr> xerox: I think I'd call that article "Quiver of Artemis" or something.
* xerox takes the vocabulary
<CosmicRay> yeah, that's the bit I don't quite get yet. If I am generating a list and wish to consume it lazily, do I write my function differently? that sort of thing. also I haven't yet quite understood the whole deal with leaving unevaluated values on the heap (there's a word for that, I forget...)
<jlouis> CosmicRay: thunks?
<CosmicRay> that sounds right
<CosmicRay> unless haskell is using "closures" to refer to somthing new, I don't think that's what I mean
<nlv11757_> closures == thunks
<JaffaCake> you probably understand "closure" to mean "function closure"
<JaffaCake> in Haskell a closure doesn't necessarily have to be a function
<JaffaCake> it can be any expression
<Darius> CosmicRay: I wrote StackOverflow that indirectly and partially addresses that issue using the folds as a representative.
<nlv11757_> lots of terms for unevaluated things
<CosmicRay> interesting, so I take it this expanded closure is required for laziness to work?
<Itkovian> he! my powerbook has been shipped!
<Darius> Though in general, hand-evaluating some iterations of a loop should give some ideas about the behavior of the function.
<nlv11757_> you dont evaluate it, just allocate a thunk and evaluate it if needed
<CosmicRay> so let's say I'm writing a recursive function like this:
<nlv11757_> you refer to the memory location of the thunk, so when it is evaluated once, every pointer to that location will suddenly not point to the thunk anymore but the evaluated value
<CosmicRay> myFoo  = 
<nlv11757_> so no re-evaluation is needed
<CosmicRay> myFoo (x:xs) = (x + 5) : myFoo xs
<JaffaCake> i.e. map (+5)
<CosmicRay> right, but I don't know how map works internally, so I wrote it out
<JaffaCake> it works just like that ;)
<CosmicRay> so when I call this function, and consume, say, 2 elements, it only evaluates the first 2 elements, right? (basic laziness)
<nlv11757_> @plugs definition map
-lambdabot/#haskell- Variable not in scope: `definition'
<CosmicRay> and those 2 elements disappear from memory entirely after I'm done with them, I assume
<Darius> CosmicRay: If they are unreferenced, yes.
<JaffaCake> they'll be garbage collected, as long as nothing else references them
<CosmicRay> so now, what if I write it this way:
<CosmicRay> myFoo accum  = accum
<CosmicRay> myFoo accum (x:xs) = myFoo ((x + 5) : accum) xs
<CosmicRay> will this force the entire list to be generated before any output is given?
<CosmicRay> (I do realize this presents the output in a different order)
--> monochrom (trebla@H146.C220.tor.velocet.net) has joined #haskell
<JaffaCake> well, the entire input is consumed before the result is available
<JaffaCake> and the output will be a list of unevaluated closures
<CosmicRay> which will probably make my program not very memory-friendly, right?
<-- rifleman has quit (Connection timed out)
<CosmicRay> or at least less memory-friendly than the first version
<CosmicRay> probably slower too?
<JaffaCake> depends on the demand
<JaffaCake> it might be faster, if you definitely want the entire list
<CosmicRay> hmm, interesting, how could it be faster?
<JaffaCake> because the tails aren't thunks
<JaffaCake> with map, each tail is a thunk, which adds a little overhead
<JaffaCake> time it and see!
<CosmicRay> it's less overhead to have a list of thunks than to have to evaluate a thunk to get the next element in a list?
<CosmicRay> I believe you, I'm just trying to understand why
<JaffaCake> in both cases, the elements will be thunks
<Darius> CosmicRay: map will produce a list of thunks as well as thunks for tails.
<JaffaCake> in the map case only, each tail is a thunk
<CosmicRay> Darius: ahh.
<CosmicRay> Darius: twice the thunkage then, eh?
<CosmicRay> so you have a thunk that returns another thunk as the next element of the list?
<JaffaCake> on the other hand, map deforests nicely....
<jlouis> But map (+5) will return quite a lot faster than myFoo, right?
<CosmicRay> jlouis: than the second version (the one with the accumulator), I'd assume
<jlouis> CosmicRay: yes
<Igloo> It'll return the /first/ element faster
<CosmicRay> this is very interesting.
<jlouis> Igloo: yes
<CosmicRay> the version with the accumulator would usually be the preferred version for other FP languages, esp. if they can do tail recursion optimization (here I'm thinking of ocaml, for instance)
<CosmicRay> since the version written more like the haskell map could consume vast amounts of stack
<Darius> CosmicRay: That's because without laziness, the "map" version in not-tail-recursive.
<JaffaCake> yes - but in Haskell you might prefer the map version for two reaons
<Darius> CosmicRay: Laziness though causes the function to "return" before handling the next element.
<CosmicRay> so it seems that in haskell, tail recursion is only desirable in some narrow cases
<CosmicRay> Darius: right
<JaffaCake> it works with infinite lists
<TFK> The Monad Reader, eh?
<jlouis> I fail to see why myFoo is better than map (+5) if we want the entire list. The overhead in map with forcing the tail is also done in myFoo when we build the list of thunks
<jlouis> what am I missing?
<shapr> TFK: yeah, want to write a TMR article?
<JaffaCake> it consumes constant memory
<JaffaCake> and it deforests nicely
<TFK> I'll get censored :-/
<JaffaCake> *3* reasons :)
<jlouis> JaffaCake: map?
* TFK reads issue #1
<Darius> CosmicRay: Tail-recursion is always desirable, it's just that different functions are tail-recursive or not.
<CosmicRay> what does "deforest" mean?
<CosmicRay> Darius: well it seems here that there are some good reasons to not be tail recursive in haskell
<nlv11757_> folding structures into values
<Darius> jlouis: Most foldr based things
<JaffaCake> deforest == eliminate intermediate structures in a composition
<Darius> CosmicRay: No. The "map" version -is- tail-recursive.
* CosmicRay blinks
<Darius> CosmicRay: It just doesn't look like it.
<JaffaCake> Darius: I don't agree
<CosmicRay> Darius: are you saying that the lazy nature of haskell makes it tail recursive automatically?
<nlv11757_> deforest prevents that first a complete structure is being built, to be broken down again.
<Lunar^> JaffaCake: btw, I have somehow narrowed a concurrency problem with FFI, but I'm still unhappy with these tests, as they don't always works
<monochrom> Deforest means in fold.unfold the intermediate list is optimized away.
<Darius> CosmicRay: In this case, yes, in general no and it can make "tail-recursive" things non-tail-recursive.
<JaffaCake> Lunar^: nice giong
<Darius> JaffaCake: If we wrote map in a strict language with explicit thunks using lambda, one would consider it tail-recursive (trivially), no?
<CosmicRay> monochrom: ah, so sort of like evaluating an equation by substituting values for the unknowns?
<JaffaCake> Darius: no, map isn't tail recursive in any language
<Lunar^> JaffaCake: where should I post them, anyway ?
<TheHunter> jlouis, oh i forgot, i already fixed SeenModule today.
<monochrom> I don't see that analogy. I think it is like fusing two loops.
<JaffaCake> Lunar^: on the bugs list, if it's not too big
--> atom-z (~firstname.lastname@example.org) has joined #haskell
<Igloo> CosmicRay: deforestation means that in map (+1) . map (+2) you never make a list whoe values are 2 more than the input list
<Lunar^> 118 total
<jlouis> TheHunter: Do you have a repository online? I could take a look at it
<Lunar^> JaffaCake: I'll do that then
<Igloo> CosmicRay: You take each value and apply (+2) then (+1) and then return that as a list cell
<TheHunter> jlouis, the 6.2 insertWith must be changed to "insertWith f k e m = FM.addToFM_C (flip f) m k e"
<CosmicRay> ok. I think I understand this bit. thanks everyone. now next question :-) How might seq or strict record fields improve performance?
<-- ex_nor has quit ("Leaving")
<TheHunter> jlouis, and the order of the arguments in SeenModule must be reversed.
<CosmicRay> Igloo: gotcha, thanks
--> exnor (~email@example.com) has joined #haskell
<jlouis> TheHunter: thats it?
<jlouis> Because then I am goin to do that ;)
<skew> CosmicRay: I think the big thing those do is give the strictness analyzer information
<monochrom> In foldl (+) 0 [1..10000], if you use seq somewhere, you will save a lot of stack space.
<TheHunter> jlouis, yeah, that's it.
<jlouis> TheHunter: okie
<CosmicRay> monochrom: what about foldr?
<monochrom> foldr (+) 0 [1..10000] doesn't really benefit from seq.
<monochrom> You use seq when you know the eager strategy beats the lazy strategy.
<skew> monochrom: (foldr (+) 0 [1..10000] :: Int) shouldn't benfit, should it?
<skew> monochrom: foldl, I mean
<JaffaCake> if you use seq in foldl, it becomes a tail-recursive accumulator
<JaffaCake> foldr isn't tail-recursive, so doesn't benefit in the same way
<monochrom> which comes to what skew says. you use seq to help the strictness analyzer to be more aggressive. this eliminates a lot of spurrious thunking.
<TheHunter> jlouis, shall send you an email, so that dons doesn't have to deal with conflicting changes?
<jlouis> TheHunter: firstname.lastname@example.org, please do
<CosmicRay> hmm, so why is there there difference between foldl and foldr in this instance?
<CosmicRay> I thought I could work that out but couldn't quite :-)
<skew> CosmicRay: foldr needs stack space
<monochrom> foldr (+) 0 [1..n] uses Theta(n) space, eager or lazy.
<CosmicRay> since it actually has to begin processing at the right side of the list?
<monochrom> foldl (+) 0 [1..n] uses O(1) space eager, Theta(n) space naive lazy.
<skew> foldr (+) 0 [1,2,3] turns into (1 + (2 + (3 + 0))), which takes linear stack space recursing down the list
<monochrom> no theoretic difference in time complexity.
<thou> is anybody running a recent ghc from CVS? I just updated and built it (trying to get GLUT to work), and it seems OK except ghci can't parse anything (I get ghc-6.5: panic! (the `impossible' happened, GHC version 6.5): tcSyntaxOp "noSyntaxExpr"); anyone seen this before?
<monochrom> (in practice, wasting space is a good way to waste time too)
<JaffaCake> thou: think that was introduced yesterday
<thou> i figured something like that, just wanted to check. thanks, JaffaCake
<JaffaCake> it smells like Simon PJ's fault :)
<JaffaCake> and he's not about today
<skew> lazy foldl (+) 0 [1,2,3] accumulates like foldl (+) (0+1) [2,3] => foldl (+) ((0+1)+2)  =>foldl (+) (((0+1)+2)+3) => (((0+1)+2)+3, then uses space forcing that expression
<CosmicRay> hmm. from my memory, in ocaml, foldr is also more efficient, but for a different reason
<CosmicRay> err, no maybe I'm mixed up.
<thou> JaffaCake: thanks
<tromp> foldr stacks many function applications
<tromp> which cannot be made strict
<CosmicRay> hmm, in ocaml, foldl is more efficient because it is tail-recursive
<tromp> foldr leaves applications in the accumalator which can be made strict
<tromp> change last foldr to foldl:)
<CosmicRay> so, if I don't force the strictness in haskell, foldr is the more efficient?
<CosmicRay> wouldn't it be possible for the optimizer to see what's going on with foldl and evaluate that expression that's stacking up immediately?
<tromp> about same memeory wise, but foldl is tail recursive
<monochrom> There was a subtlety in the statements I made. On space complexity of foldl (+) 0 [1..n], my statement is about space of "foldl (+) 0", not of the part [1..n].
--> _JusSx__ (~email@example.com) has joined #haskell
<tromp> that's what strictness analysis might achieve, CosmicRay
<Darius> CosmicRay: Yes, but if the argument passed to foldl isn't strict then treating it as such can change the behavior of the code.
<CosmicRay> tromp: ah.
<tromp> not sure exactly what cases ghc will catch
<CosmicRay> so I'm looking at the implementation of foldl and foldr in hugs for clarity. to me it looks like one would end up with closures all over the place with either one
<CosmicRay> foldl f z  = z
<CosmicRay> foldl f z (x:xs) = foldl f (f z x) xs
<CosmicRay> foldr f z  = z
<CosmicRay> foldr f z (x:xs) = f x (foldr f z xs)
<Heffalump> foldl' exists
<CosmicRay> yes, I know
<CosmicRay> I want to understand these first :-)
<jlouis> TheHunter: applied, thanks
<JaffaCake> foldl always creates a chain of new thunks the same length as the input list
<CosmicRay> do those thunks represent (f z x) or the recursive call to foldl itself (or both?)
<monochrom> f z x
<JaffaCake> the (f z x) call is the thunk; the recursve call to foldl is tail recursive
<CosmicRay> so when my program demands the result from foldl be computed, it first runs through the entire list via the tail recursion, generating a bunch of thunks, then it evaluates all the thunks to produce the final result?
<nlv11757_> isnt it the case that only one thunk is created initially for the top-level call....and only when for example a bit of result is needed, this thunk is evaluated a bit creating a new thunk representing the tail call......
<monochrom> if naive, yes
<nlv11757_> if that is enough for the result that was needed of course, otherwise more is evaluated...
<nlv11757_> thats the idea right?
<CosmicRay> well with foldl and foldr, there is no "bit" of the result, there is all of the result or none of it
<CosmicRay> since they aren't building lists
<tromp> no, it has a lazy (thunked) representation of f (f (... (f z xn) ... x1) x0
<monochrom> yes nlv, but what is tail-recursive in eager becomes monolithic in lazy: either not computed at all or pursued to its final conclusion
<nlv11757_> ow of course in the fold case there is no bit of result, but i meant it in general for recursive functions....
<CosmicRay> monochrom: in that case, why would we get a whole chain of thunks out there with foldl?
<monochrom> For example "take 10 (map f [1..])", with lazy you'll just generate 10 items of the list.
<monochrom> The thunk looks like (((a+b)+c)+d)+e...
<monochrom> I mean the thunk looks like (((a+b)+c)+d)+e... if naively lazy.
<-- JusSx has quit (Read error: 110 (Connection timed out))
<monochrom> There is no urgency to recall 0+1 = 1 until you've finished foldl-ing
<CosmicRay> ok, I think I grok that.
<CosmicRay> so with foldr, why don't we wind up in the same situation?
--> chris2 (~chris@p54889A9B.dip0.t-ipconnect.de) has joined #haskell
<monochrom> we do.
<nlv11757_> wouldnt the first thunk be 'map f [1..]'
<monochrom> but no one whines about foldr :)
<CosmicRay> oh right, you said they both use Theta(n) space lazy.
<CosmicRay> why do we even have a foldl given foldl'?
<nlv11757_> btw there can be a bit of result when dealing with a fold
<-- metaperl_ has quit (Read error: 145 (Connection timed out))
<nlv11757_> it depends on the operator
<monochrom> I can't think of an application of the lazy foldl for the moment.
<Darius> CosmicRay: There is a semantic difference between foldl and foldl'
<Darius> But foldl is mostly useless.
<nlv11757_> see cosmicray, if there never was a bit of result to a fold......take 10 (foldr (:)  ([1 .. ])) wouldnt work
<CosmicRay> ah ok, so foldl is useful when there is in fact a bit of the result, due to laziness?
<nlv11757_> foldl is usefull when not used in a lazy sense
<nlv11757_> i think
<Darius> CosmicRay: A possible scenario when you might want to use foldl v. foldl' is described on http://www.haskell.org/hawiki/StackOverflow.
<musasabi> Created a DiceModule for lambdabot...
<monochrom> I can think up a pathological example.
<Darius> CosmicRay: Nevertheless, it's not very compelling and in almost all cases you want foldl' (unless you want foldr).
<wilx> @type foldl'
<nlv11757_> foldl1 ?
<wilx> @type foldl1
-lambdabot/#haskell- foldl1 :: forall a. (a -> a -> a) -> [a] -> a
<wilx> @type foldl
-lambdabot/#haskell- foldl :: forall a b. (a -> b -> a) -> a -> [b] -> a
<vegai> question for you ... oldtimers
<monochrom> good explanation on that wiki
<vegai> when you see the type of the complexity of e.g. foldl, do you instantly understand it?
<Lunar^> JaffaCake: Do "dynamic" wrapped functions always get their own thread?
<JaffaCake> Haskell thread? yes
<Darius> @type Data.List.foldl'
<monochrom> Almost instantly.
-lambdabot/#haskell- Data.List.foldl' :: forall a b. (a -> b -> a) -> a -> [b] -> a
<CosmicRay> thanks for all the exlpanations
After spending some time deleting this morning's spam, I'd like to suggest that trackbacks be disabled on this site for now. I haven't seen any people using them legitimately, and Drupal's spam module -- while excellent in fighting comment spam -- doesn't do anything about trackback spam yet. Comments?
In the meantime, if you see spam that's been on the site for 24 hours or more, please e-mail me.
[07:23] [metaperl] The Haskell Road To Logic, Maths And Programming has a very seductive intro and table of contents... I am starting to realize that I am going nowhere in Haskell unless I learn to thi\
nk and know functions well... humbling after 5 years as a well-paid Perl programmer
[07:23] [RemiTurk] false = \x y -> y
[07:23] [RemiTurk] if = id
[07:24] [RemiTurk] not = flip
[07:24] [shapr] metaperl: I realize I'm not learning much if I don't get humbled on a regular basis.
[07:24] [xerox] metaperl: I am reading it too!
[07:24] [metaperl] xerox, do you like it?
[07:24] [xerox] metaperl: I'm just at page 28, good so far.
[07:25] [shapr] http://lambda-the-ultimate.org/ is one good place to find humbleness. I also like to put forth my ideas in my blog and here on #haskell and ask people to poke holes. That mostly works.
[07:27] [xerox] metaperl: do you like it?
[07:27] [metaperl] I'm thinking about buying it
[07:28] [shapr] Isn't it free online?
[07:28] [metaperl] I think you are just playing with air until you lock in a rigourous command of logic and functions. And this book doesnt play around... it takes you right to the foundation and build\
s you up from there
[07:28] [metaperl] the first chapter and toc is
[07:28] [metaperl] all the other books give you a chance to fool yourself about knowing haskell --- not this one
[07:28] [metaperl] this book puts logic first and the language second
[07:28] [xerox] shapr: no it isn't :\
[07:29] [metaperl] all the others put the language first and hope you can build the logical reasoning... and it wont happen
[07:29] [shapr] Strange, I wonder where I got my ps.gz copy.
[07:29] [xerox] Its first example is a prime number test, it first prove to you some things -mathematecally- then implement it in Haskell, err, rewrites the proofs in Haskell :D
[07:30] [xerox] shapr: hm, it could be really.
[07:30] [xerox] http://homepages.cwi.nl/~jve/HR/
I think Haskell is the best computer programming language. I say so for two reasons. First, the community of people is excellent. I have yet to see one arrogant hothead anywhere. I have had discussions with some of the best Haskell programmers out there with absolutely no sense of them thinking they are better than me. Having been around Perl for 5-7 years, let me tell you, it is rough in Perl world. And there are plenty of arrogant, egotistical self-centered people in that community. In fact, I suppose being around them has made me that way as well. Haskell is also the best computer programming language because of it's wonderful mix of strong typing and pure functional approach. Finally, haskell is an amazing project to be the result of a community effort. How in the world could a group of people get together and not simply fight and bicker and never agree? Perl was a 1-man language. Emacs was a 1-man language. Ditto for Tcl, Python and I guess PHP. Unix was a 2-man project.
Anyway, now that I have said how great a _language_ Haskell is, let us note its chief deficiency. No available functionality. I remember 2 weeks ago Philippa was asking in #haskell about whether another person had a CGI extension to deal with cookies... Perl has had this since the dot-com era! Perl has search.CPAN.org, which is a huge library of modular functionality for a plethora of real-world scenarios.
Now, let's move on to PHP. From a purely linguistic perspective, PHP is a worse language than Perl: http://tnx.nl/php. BUT from an application perspective, it outdoes Perl. I did not say from a modular/functionality perspective, but from the application perspective.
Applications are tools that are ready to run. Modules are pluggable units of functionality to build applications. Functions are even smaller units of pluggable functionality to build applications. Abstractly, applications, modules and functions all fill a gap. Applications fill the largest gap, modules fill the medium gap and functions fill the smallest gap. The relation between immediate ability to fill a gap and language quality is inversely proportional. Or, concretely: PHP sucks as a language, but rules in filling the biggest gap in apps. Haskell rules as a language, but sucks in filling application and modular gaps. And Perl is right in the middle. It is a decent language with great ability to fill in the middle gaps. It is not as composable as haskell and thus cannot compete in filling in small gaps. It lacks good, out-of-the-box applications like phpMyAdmin and Drupal and postNuke, and phpBB.
/me dons flame-retardant vest.
Spend your time defining and composing. If you are trying to get something done in 2 lines, then you are trying too hard and have not separated out the task fully.
It's hard for imperative-trained programmers to grasp this, but it is vitally important.
[11:19] [Cale] update x n xs = take n xs ++ x : drop (n+1) xs
Unlike an imperative language, Haskell is purely functional. This means when you look at something, it will (a) always return the same output for certain input (b) only return output based on the input you are looking at (c) do nothing but return output.
This means each expression is Haskell is referentially transparent and stands on its own. Understand the function in an expression and you understand the expression.