code logs -> 2007 -> Wed, 17 Oct 2007< code.20071016.log - code.20071018.log >
--- Log opened Wed Oct 17 00:00:05 2007
00:01 Forj [~Forj@Nightstar-2310.ue.woosh.co.nz] has quit [Quit: Gone]
00:03 GeekSoldier_ [~Rob@Nightstar-5322.pools.arcor-ip.net] has joined #code
00:21 ReivZzz [~reaverta@Admin.Nightstar.Net] has joined #Code
00:41 Chalcy [~Chalcedon@Nightstar-2310.ue.woosh.co.nz] has joined #code
00:41 mode/#code [+o Chalcy] by ChanServ
00:42 Chalcedon [~Chalcedon@Nightstar-2310.ue.woosh.co.nz] has quit [Ping Timeout]
00:46 GeekSoldier_ [~Rob@Nightstar-5322.pools.arcor-ip.net] has quit [Ping Timeout]
01:33 You're now known as TheWatcher[T-2]
01:36 gnolam [~lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has quit [Quit: Z?]
01:38 Chalcy is now known as Chalcedon
01:38 You're now known as TheWatcher[zZzZ]
03:20 Doctor_Nick [~nick@Nightstar-23600.hsd1.fl.comcast.net] has quit [Operation timed out]
03:21 <@ToxicFrog> Hmm.
03:21 <@ToxicFrog> I guess it would help if population.new actually returned the newly created population.
03:22 <@ToxicFrog> ...it would also help if I weren't using the same name for a namespace and a function argument.
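A minimal Lua sketch of the two pitfalls ToxicFrog describes (the `population` module name and constructor shape are assumptions; the original code isn't shown): a parameter that shadows the namespace, and a constructor that forgets to return the table it builds, leaving the caller with `nil`.

```lua
-- Hypothetical reconstruction: a population "module" in plain Lua.
population = {}

-- Buggy: the parameter `population` shadows the namespace above,
-- and the function never returns the table it builds.
function population.new_buggy(population)
  local pop = { members = population or {} }
  -- forgot: return pop
end

-- Fixed: rename the argument and return the new table.
function population.new(members)
  local pop = { members = members or {} }
  return pop
end

print(population.new_buggy({1, 2}))      -- nil
print(population.new({1, 2}).members[1]) -- 1
```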
03:22 Doctor_Nick [~nick@Nightstar-23600.hsd1.fl.comcast.net] has joined #code
04:27 GeekSoldier_ [~Rob@Nightstar-5322.pools.arcor-ip.net] has joined #code
04:31 GeekSoldier_ [~Rob@Nightstar-5322.pools.arcor-ip.net] has quit [Ping Timeout]
04:54 Raif [~corvusign@Nightstar-25074.hsd1.wa.comcast.net] has joined #Code
07:16 * McMartin plays with Haskell.
08:19 <@McMartin> Wow, ghc is mind-bogglingly space-inefficient if you're doing any serious amount of I/O.
08:20 <@McMartin> On the order of several hundred KB per line, scaling linearly.
08:54 You're now known as TheWatcher
09:11 Chalcedon is now known as ChalcyAFK
09:48 gnolam [lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has joined #Code
09:48 mode/#code [+o gnolam] by ChanServ
10:09 ReivZzz is now known as Reiver
10:46 ChalcyAFK [~Chalcedon@Nightstar-2310.ue.woosh.co.nz] has quit [Quit: Gone]
11:43 Xiphias [Ameroth@Nightstar-25243.dsl.in-addr.zen.co.uk] has joined #code
12:09 Xiphias [Ameroth@Nightstar-25243.dsl.in-addr.zen.co.uk] has quit [Quit: I was never gone]
13:49 Mischief [~Genesis@Nightstar-7565.hsd1.md.comcast.net] has quit [Ping Timeout]
14:05 <@ToxicFrog> o.O
14:13 <@ToxicFrog> $ lua ga.lua
14:13 <@ToxicFrog> 0.7 0.62 0.62 0.66 0.64 0.74 0.7 0.66 0.84 0.8 0.8 0.88 0.88 0.88 0.88 0.96
14:13 <@ToxicFrog> Done after 16 iterations.
14:13 <@ToxicFrog> Woo!
15:10 gnolam [lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has quit [Quit: Nuking my computer.]
15:34 You're now known as TheWatcher[afk]
15:51 < Doctor_Nick> hallabalua!
16:18 Reiver is now known as ReivZzz
16:24 You're now known as TheWatcher
16:26 <@ToxicFrog> Memoizing the fitness function gives it nearly a twofold speed increase.
16:37 < MyCatVerbs> ToxicFrog: sounds about right, assuming about a third of the population survives unaltered each generation.
16:38 <@ToxicFrog> Very little, if any, of the population survives unaltered.
16:38 <@ToxicFrog> However, fitness() gets hit a lot.
16:42 GeekSoldier_ [~Rob@Nightstar-5512.pools.arcor-ip.net] has joined #code
17:09 You're now known as TheWatcher[afk]
18:13 < MyCatVerbs> ToxicFrog: surely fitness() only gets hit once for each pop member each generation?
18:13 AngryDrake [AnnoDomini@Nightstar-29508.neoplus.adsl.tpnet.pl] has joined #Code
18:13 AnnoDomini [AnnoDomini@Nightstar-28906.neoplus.adsl.tpnet.pl] has quit [Ping Timeout]
18:13 * MyCatVerbs when writing GAs always maps the fitness function across the pop array as the first step, and works from there.
18:13 AngryDrake is now known as AnnoDomini
18:19 You're now known as TheWatcher
18:30 gnolam [lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has joined #Code
18:30 mode/#code [+o gnolam] by ChanServ
18:32 <@ToxicFrog> MyCatVerbs: no, it gets hit when get_best is called, and in a bunch of other places
18:41 GeekSoldier_ is now known as GeekSoldier
18:44 < MyCatVerbs> ToxicFrog: yikes! No wonder you had to memoize.
18:44 <@ToxicFrog> It was easier to write it that way; once it became clear that that was a bottleneck, I memoized it.
18:44 * MyCatVerbs usually just keeps the populations' individual fitness values in memory along with them, or in an array with the same indexes as them to avoid recalculating.
18:45 <@ToxicFrog> Arrays with floating point indexes are harder to iterate over.
18:45 < MyCatVerbs> I thought you meant memoization between generations, which only really helps if a fairly hefty proportion of the critters survive each generation.
18:45 <@ToxicFrog> Well, it works for that too.
18:45 < MyCatVerbs> ToxicFrog: why floating point indexes?
18:45 <@ToxicFrog> It's memoization in general.
18:45 <@ToxicFrog> If the same chromosome pops up in different generations, it's still memoized.
18:45 <@ToxicFrog> Because that's how I calculate fitness for this particular problem?
18:46 < MyCatVerbs> Yeah, but what I mean is that the way I do it is only relevant between generations, so that's the only place where the performance difference it (could, potentially) make ever shows up.
18:46 <@ToxicFrog> Also, I note that mapping fitness to chromosome runs into problems when you have multiple chromosomes with the same fitness.
18:46 < MyCatVerbs> Huh. I'm starting from an array of popmembers, and mapping popmember[i] -> fitnesses[i]. Integer subscripts, floating point array.
18:47 Forj [~Forj@Nightstar-2310.ue.woosh.co.nz] has joined #code
18:47 mode/#code [+o Forj] by ChanServ
18:48 <@ToxicFrog> Oh. That's what you meant by "array with the same indexes"
18:48 <@ToxicFrog> If I were going to do that sort of thing, I'd choose a chromosome representation that let me encode fitness into it, but memoizing fitness accomplishes the same goal and is easier.
18:49 < MyCatVerbs> ToxicFrog: you could just use two arrays, or a tuple of (fitness,chromosome).
18:49 <@ToxicFrog> Isn't that what I just said?
18:50 < MyCatVerbs> Seems like it. I'm trying to get across that I don't see how adding memoization is easier than doing that, unless whoever wrote your programming language made function memoization a language primitive.
18:52 < MyCatVerbs> Plus, it's a tad faster unless the inter-generation savings are significant.
18:53 < MyCatVerbs> (OTOH, if somebody *did* make "memoize this function" a Lua primitive, then kudos to them, heh.)
19:03 <@ToxicFrog> It's not a language primitive, but it is something I already had lying around.
19:03 < MyCatVerbs> Fair 'nuff.
19:03 < MyCatVerbs> And, pity, that would've been awesome. :)
19:03 <@ToxicFrog> It's easier to abstract all requests for fitness to a fitness() function than worry about implementation, and it's easier to memoize that function once written (fitness = memoize(fitness)) than to implement caching.
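The `memoize` helper itself never appears in the log; a minimal Lua sketch of one, assuming chromosomes are values that can serve directly as table keys (a string-encoded bitstring here, for illustration):

```lua
-- Generic single-argument memoizer. Assumes the argument is usable as a
-- table key and that f never returns nil (a nil result would miss the
-- cache on every call).
function memoize(f)
  local cache = {}
  return function(x)
    local hit = cache[x]
    if hit == nil then
      hit = f(x)
      cache[x] = hit
    end
    return hit
  end
end

-- Hypothetical fitness function (fraction of 1-bits), instrumented to
-- count how often it is actually evaluated.
local calls = 0
local function fitness(chromosome)
  calls = calls + 1
  local score = 0
  for i = 1, #chromosome do
    if chromosome:sub(i, i) == "1" then score = score + 1 end
  end
  return score / #chromosome
end

fitness = memoize(fitness)  -- the one-line change described in the log

print(fitness("10110"), fitness("10110"), calls) -- 0.6  0.6  1
```

The second lookup of the same chromosome is a cache hit, which is why repeat chromosomes across generations stay memoized, as ToxicFrog notes above.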
19:08 < MyCatVerbs> I see.
19:15 <@ToxicFrog> I may change this later if it turns out that fitness() is still a major bottleneck, but until then it's a lot of effort - relatively speaking - for no real gain.
19:17 < MyCatVerbs> Mmmmhmmm. S'just that, the order I'd usually write a GA in, caching fitness() just sort of comes about more or less for free.
19:24 mode/#code [+o AnnoDomini] by ChanServ
19:45 <@ToxicFrog> Well, fitness() is going to be a function no matter what, as there's no guarantee that the chromosome representation gives me fields.
19:46 <@ToxicFrog> And once it's a function, which is easier, making sure that all my chromosome implementations cache fitness, or just memoizing it?
20:17 GeekSoldier [~Rob@Nightstar-5512.pools.arcor-ip.net] has quit [Ping Timeout]
20:25 GeekSoldier [~Rob@Nightstar-4040.pools.arcor-ip.net] has joined #code
20:26 GeekSoldier is now known as GeekSoldier|bed
20:30 GeekSoldier_ [~Rob@Nightstar-5344.pools.arcor-ip.net] has joined #code
20:31 GeekSoldier|bed [~Rob@Nightstar-4040.pools.arcor-ip.net] has quit [Ping Timeout]
20:34 GeekSoldier_ [~Rob@Nightstar-5344.pools.arcor-ip.net] has quit [Ping Timeout]
20:43 Forj [~Forj@Nightstar-2310.ue.woosh.co.nz] has quit [Quit: Gone]
20:49 Chalcedon [~Chalcedon@Nightstar-2310.ue.woosh.co.nz] has joined #code
20:49 mode/#code [+o Chalcedon] by ChanServ
20:55 <@McMartin> Hmm. Hey, MCV, still around?
22:20 < MyCatVerbs> Yes.
22:21 < MyCatVerbs> I just happen not to check most rooms on IRC very often. Poke me by name, though, and irssi's highlighting will summon me more rapidly, assuming I'm actually here. >>
22:21 < MyCatVerbs> How can I help you, anyhoo?
22:38 <@McMartin> Aha
22:38 <@McMartin> I was playing with Haskell and getting unbelievably large code sizes
22:38 <@McMartin> On the order of several tens of KB of binary per line
22:39 <@McMartin> (Working through YAHT - Numbers.hs was the test file.)
22:39 <@McMartin> I was wondering if this was typical for highly monadic code, if it meant ghc's x86_64 support wasn't great yet, or if there was some kind of flag I was missing.
22:40 <@McMartin> Also, the KB/line bit here is a difference between adding a bunch of print statements and then removing them, so as -- I thought -- to control for linking in stuff like the garbage collector
22:43 <@McMartin> http://en.wikibooks.org/wiki/Haskell/YAHT/Language_basics/Solutions#Interactivity
22:43 * McMartin then recoded it in OCaml and C++ just for comparison.
22:43 <@McMartin> C++ was unreasonably low, and I think that has to do with libstdc++ improving since the last time I used it.
22:44 <@McMartin> Since it was about 20K, and I remember back when cout << "Hello world!" << endl; was 100.
22:45 <@McMartin> (Numbers.ml: ~150KB; Numbers.hs: ~900KB, and adding a dozen or so putStrLns in front of the main program kicked it up to 1.2MB.)
22:46 < MyCatVerbs> Uh.
22:46 < MyCatVerbs> GHC doesn't have dynamic linking yet. All Haskell binaries are static, except (apparently) on Macs.
22:47 <@McMartin> That would explain the ~900KB, but not the 1.2MB.
22:47 < MyCatVerbs> For example, you use *anything* in System.Posix, you get the whole Unix library built in.
22:47 <@McMartin> (That said, ldd doesn't seem to agree, but I haven't checked to see if it actually uses it)
22:48 < MyCatVerbs> McMartin: not sure how it works exactly, but all the Haskell libraries are built in. I think some parts of the run-time system are shared though, thankfully.
22:49 < MyCatVerbs> Try removing all but one of the putStrLn's prefixed to the front and see if the executable size changes?
22:53 <@McMartin> Yeah, sec
22:54 <@McMartin> Actually, easier way to do this is with Hello, World and Hello World + Hello Mars.
22:56 <@McMartin> OK, the difference between one call to putStrLn and two calls to it is 2,444 bytes of object code.
22:57 <@McMartin> And then 548 between 2 and 3.
23:01 <@McMartin> (Also, FWIW, ghc seems to link in libs c, m, rt, dl, gmp, and pthreads.)
23:05 < MyCatVerbs> GMP is used because the Integer type is arbitrary-precision, and switches to libgmp's huge integer type automatically on overflow.
23:06 * McMartin nods
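A small illustration of why GHC links libgmp: the default `Integer` type is arbitrary-precision, so values silently grow past machine-word range, while `Int` is a fixed-width word that simply wraps. (The factorial workload is an illustration, not from the log; the switch to GMP limbs happens inside GHC's runtime.)

```haskell
-- Int is a fixed-width machine word; Integer is arbitrary-precision,
-- backed by libgmp in GHC.
main :: IO ()
main = do
  print (product [1 .. 25] :: Integer)  -- exact value of 25!
  print (product [1 .. 25] :: Int)      -- same computation wraps on 64-bit
```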
23:06 <@McMartin> I assume the pthreads are for some kind of automatic parallelization that FP is supposed to be decent at?
23:07 < MyCatVerbs> Pthreads is pulled in for various reasons. One of them is the fact that file IO is done in Haskell by a second thread aside from the main computation.
23:07 < MyCatVerbs> McMartin: I do not believe that there's any automatic parallelization, but never fear because it's ridiculously easy to get GHC to parallelize your code.
23:09 < MyCatVerbs> The easiest way is to use the Control.Parallel module, which ships with GHC. There's an operator called "par", which, uh, fuckit. Here's the documentation, I can't explain it better than that.
23:10 < MyCatVerbs> If you do: par longexpensivecomputationa anotherlongexpensivecomputationthathalfwaythroughdependsontheresultoflongexpensivecomputationa, then you get really easy parallelism. :)
23:11 < MyCatVerbs> Er, par a b, where a and b are expensive computations, and b depends somewhere (not right at the beginning) on the value of a.
23:12 < MyCatVerbs> Also you have to give a compiler argument, -threaded, for GHC, to get any benefit from that. >>
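A sketch of the `par a b` idiom MyCatVerbs describes, in the standard spark-then-combine form (compile with `ghc -threaded` and run with `+RTS -N` to actually use multiple cores; the naive-Fibonacci workload is an assumption for illustration). `par` sparks its first argument for possible evaluation on another core, `pseq` forces the second, and the results meet in the combining expression:

```haskell
import Control.Parallel (par, pseq)

-- Deliberately naive Fibonacci, so there is real work to spark.
nfib :: Int -> Integer
nfib n | n < 2     = 1
       | otherwise = nfib (n - 1) + nfib (n - 2)

-- Spark `a` in parallel, force `b` on the main thread, then combine.
-- The demand for `a` comes only at the end, giving the spark time to run.
parSum :: Int -> Int -> Integer
parSum x y =
  let a = nfib x
      b = nfib y
  in a `par` (b `pseq` (a + b))

main :: IO ()
main = print (parSum 25 25)
```

Without `-threaded` (and `+RTS -N`), the program still runs correctly; the spark is simply evaluated on the single capability, which matches MyCatVerbs's point below.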
23:12 <@McMartin> nod
23:12 <@McMartin> Beyond my current ken.
23:13 < MyCatVerbs> But, basically everything is thread-safe in Haskell, and the people who maintain the libraries and the compiler slit your throat if you aren't.
23:14 <@McMartin> It's hard to not be with no state.
23:15 < MyCatVerbs> Indeed. Well. You often have state in practice, which means you don't really have instant parallelism, but referential transparency makes it safe to try anyway.
23:15 < MyCatVerbs> Also, the CPU profiling in Haskell is pretty spiffy. ^^
--- Log closed Thu Oct 18 00:00:12 2007