2024/04/30

Newest at the top

2024-04-30 22:00:11 +0200 <int-e> Increasing -F is also a possibility I guess, but it feels like a rather blunt tool.
2024-04-30 21:59:14 +0200 <tomsmeding> int-e: at least for the program in question, -G3 doesn't help
2024-04-30 21:58:12 +0200 <mauke> https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-options-to-cont…
2024-04-30 21:56:53 +0200 <int-e> (RTS option to set the number of generations)
2024-04-30 21:56:36 +0200 <int-e> does -G 3 ever help with this kind of repeated GC?
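For reference, the flags under discussion are ordinary RTS options; a sketch of how they would be passed (`Main.hs` is a stand-in program, not from the log):

```shell
# Compile with -rtsopts so the binary accepts RTS flags at run time.
ghc -O2 -rtsopts Main.hs

# -G sets the number of GC generations (default 2); -s prints GC statistics.
./Main +RTS -G3 -s -RTS

# -F sets the factor by which the heap may grow after a major GC (default 2),
# the blunter tool int-e mentions.
./Main +RTS -F4 -RTS
```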
2024-04-30 21:56:32 +0200 <tomsmeding> it would just make it harder to believe that the claims we're making about how the algorithms work actually hold in practice
2024-04-30 21:56:31 +0200 <monochrom> OK, but the reference implementation is allowed to be slow. :)
2024-04-30 21:56:09 +0200 <tomsmeding> I could also just not write the implementation and the paper would not be much weaker
2024-04-30 21:55:53 +0200 <tomsmeding> the reference implementation is there to show that the essence of the algorithm that we get halfway is not nonsense
2024-04-30 21:55:32 +0200 <tomsmeding> monochrom: the point of the paper is that it optimises a really crappy algorithm from theory to a known fast thing
2024-04-30 21:55:03 +0200 <monochrom> Um ideally a paper is not supposed to contrive things just for the sake of "interesting" and "new". At least, one can hope...
2024-04-30 21:55:00 +0200 <EvanR> you already know that's 1. the slow part and 2. going to be too slow
2024-04-30 21:54:58 +0200 <tomsmeding> my structure doesn't even have any cycles, you could reference count GC it and all would be good!
2024-04-30 21:54:41 +0200 <EvanR> worrying about the performance of adding to a compact region seems like,
2024-04-30 21:54:24 +0200 <tomsmeding> mauke: that's a neat quote for that
2024-04-30 21:53:35 +0200 <mauke> by which I mean that under a copying collector, anything not traversed (and copied) will be collected implicitly
2024-04-30 21:53:05 +0200 <tomsmeding> but >.<
2024-04-30 21:52:57 +0200 <tomsmeding> but I don't want to implement that version because then, in the context of the paper, the implementation is completely uninteresting because it's something that already exists
2024-04-30 21:52:37 +0200 <tomsmeding> destroying all problems in one fell swoop
2024-04-30 21:52:26 +0200 <tomsmeding> the last optimisation step for the algorithm in the paper is to make the whole thing imperative, which for the implementation would mean that this big data structure just becomes a huge unboxable array
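A sketch of what that final imperative form could look like, using a mutable unboxed array from the `array` package that ships with GHC (the shape and contents are illustrative):

```haskell
import Data.Array.IO (IOUArray, newArray, writeArray, readArray)

-- An imperative phase writing into one flat unboxed array. Since the
-- array holds no pointers, the GC never has to traverse its contents.
main :: IO ()
main = do
  arr <- newArray (0, 9) 0 :: IO (IOUArray Int Int)
  mapM_ (\i -> writeArray arr i (i * i)) [0 .. 9]
  x <- readArray arr 5
  print x   -- 25
```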
2024-04-30 21:52:24 +0200 <EvanR> lol
2024-04-30 21:52:04 +0200 <mauke> https://marcopeg.com/content/images/size/w2000/2021/11/everything-not-saved-will-be-lost.png
2024-04-30 21:51:57 +0200 <tomsmeding> my problem is, this is a reference implementation for a paper
2024-04-30 21:51:15 +0200 <EvanR> rewrite it in rust
2024-04-30 21:50:31 +0200 <tomsmeding> that's why I want the GC to ignore this thing, it's absolutely pointless to traverse it!
2024-04-30 21:50:18 +0200 <tomsmeding> everything that I add to it stays until the second phase of the program
2024-04-30 21:50:10 +0200 <monochrom> :(
2024-04-30 21:50:06 +0200 <tomsmeding> at least not in this structure
2024-04-30 21:50:03 +0200 <tomsmeding> monochrom: there is no dead data in my application
2024-04-30 21:49:45 +0200 <monochrom> In the sense that "[m..n] = m : [m+1 .. n]" is a thunk that generates a little data and a new thunk, and although you still have to GC, it fruitfully collects dead data, and the surviving new thunk is still small, so the overall memory footprint stays small.
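The streaming effect monochrom describes, in a minimal sketch: consuming the enumeration one cons cell at a time lets each GC reclaim the cells already consumed, so the live set stays tiny despite a huge logical list.

```haskell
import Data.List (foldl')

-- A strict fold over a lazily generated enumeration: each [m..n] thunk
-- yields one element and a new small thunk, and the consumed prefix is
-- dead data the GC collects, so this runs in constant space.
main :: IO ()
main = print (foldl' (+) 0 [1 .. 1000000 :: Int])
```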
2024-04-30 21:49:44 +0200 <tomsmeding> maybe I should try changing that
2024-04-30 21:49:40 +0200 <tomsmeding> so it's all thunks anyway
2024-04-30 21:49:31 +0200 <tomsmeding> I mean, the whole structure _is_ lazy
2024-04-30 21:49:21 +0200 <int-e> presumably this is not one of those cases
2024-04-30 21:49:04 +0200 <int-e> sometimes laziness makes the live data smaller
2024-04-30 21:47:01 +0200 <tomsmeding> I don't see how making things thunks, or doing the difference-list thing, etc. helps there
2024-04-30 21:46:33 +0200 <tomsmeding> monochrom: I'm not exactly sure what you mean; the problem that I have is that the GC uselessly traverses my huge data structure that is live anyway
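Making the GC ignore a large, always-live structure is what compact regions (EvanR's mention above) offer; a minimal sketch, assuming the `ghc-compact` package that ships with GHC. One caveat relevant to this discussion: compacting fully evaluates the data, which conflicts with a structure that is deliberately lazy.

```haskell
import Data.Compact (compact, compactAdd, getCompact)

-- Copy long-lived data into a compact region once; afterwards the GC
-- treats the region as a single object and never traverses its insides.
main :: IO ()
main = do
  c0 <- compact ([] :: [Int])        -- start a region
  c1 <- compactAdd c0 [1 .. 1000]    -- add more data (forces it!)
  print (sum (getCompact c1))        -- 500500
```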
2024-04-30 21:46:22 +0200 <EvanR> data is definitely data while closures might contain a bunch of closures, or thunks. So closures are the better bet xD
2024-04-30 21:43:26 +0200 <monochrom> But changing to thunks can be worthwhile if it enables streaming.
2024-04-30 21:42:55 +0200 <monochrom> That would be what I said about "just changes data to thunks".
2024-04-30 21:41:25 +0200 <tomsmeding> you just get a network of closures with the exact same structure as the original data structure :)
2024-04-30 21:41:09 +0200 <tomsmeding> you just create indirection
2024-04-30 21:41:02 +0200 <tomsmeding> if you replace the nodes of the data structure by closures, you don't make the heap size any smaller
2024-04-30 21:40:24 +0200 <monochrom> I wonder if "diff list but for snoc list" helps.
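A minimal sketch of "diff list but for snoc list" (names are illustrative): the sequence is represented as a function, so snoc is O(1) composition and the real list is materialised only once at the end. As noted above, though, this just trades data for closures of the same shape.

```haskell
-- Hughes-style difference list with appending at the right end.
newtype SnocDList a = SnocDList ([a] -> [a])

emptyS :: SnocDList a
emptyS = SnocDList id

snoc :: SnocDList a -> a -> SnocDList a
snoc (SnocDList f) x = SnocDList (f . (x :))

toList :: SnocDList a -> [a]
toList (SnocDList f) = f []

main :: IO ()
main = print (toList (foldl snoc emptyS [1 .. 5 :: Int]))  -- [1,2,3,4,5]
```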
2024-04-30 21:38:07 +0200 <tomsmeding> it would reduce the peak heap size a lot