Newest at the top
| 2026-03-10 13:43:16 +0100 | <Freakie> | do you know if the compact regions are deep copied? |
| 2026-03-10 13:42:51 +0100 | <Freakie> | and then sorting them lazily (i.e. when you actually need to traverse said level) |
| 2026-03-10 13:42:24 +0100 | <Freakie> | otherwise I was planning on using the levelwise layout of the BDD to separate the closest data (for example what needs to be processed on the current and next level) from the priority queue itself |
| 2026-03-10 13:41:02 +0100 | <Freakie> | maybe a compact PQ could help but then I suppose the region will just keep growing till the algorithm is done? |
| 2026-03-10 13:40:33 +0100 | <Freakie> | I know, I'm just not working with many other data structures than the BDDs themselves (which are represented by lists, and only traversed during algorithms) and the priroity queue |
| 2026-03-10 13:40:31 +0100 | <comerijn> | Including (parts) of your priority queue |
| 2026-03-10 13:40:06 +0100 | <comerijn> | Freakie: I mean, you don't have to limit yourself to compact individual BDD. You can compact essentially any arbitrary tree of data |
| 2026-03-10 13:39:31 +0100 | <comerijn> | Depends how often you trigger GC ;) |
| 2026-03-10 13:39:26 +0100 | <Freakie> | I can utilize the levelwise layout of the lists though |
| 2026-03-10 13:39:12 +0100 | <Freakie> | yeah I jjust mean to say that the extra cost of traversing the input lists shouldn't be the deciding factor |
| 2026-03-10 13:38:51 +0100 | <comerijn> | Freakie: The same applies for any of the data during the algorithms :) |
| 2026-03-10 13:38:46 +0100 | <Freakie> | for n = 8 the largest BDD generated only has like 10k nodes while the number of insertions to the PQ can hit double digits |
| 2026-03-10 13:38:36 +0100 | <comerijn> | ok, so that's probably not where your cost is |
| 2026-03-10 13:38:09 +0100 | <Freakie> | the thing is just that the BDDs themselves are not that large relative to the data allocated during the algorithms |
| 2026-03-10 13:38:04 +0100 | <comerijn> | Freakie: When you create a compact region the GC copies a (transitive) data structure from the heap to a separate region, and then during GC we either free the entire region at once or only copy the pointer to the region to the new heap. Reducing GC cost from however large the data to 1 pointer |
| 2026-03-10 13:36:56 +0100 | <comerijn> | Freakie: Ok, so you have some tree data structure scattered across your heap, right? Every GC you copy that entire thing to a new heap, then wipe the old. If that tree is huge and you keep it around a long time, that's a lot of continual copying |
| 2026-03-10 13:36:52 +0100 | <Freakie> | I think it wouldn't help much though because almost all of the allocations in my program comes manipulating the priority queue (regardless of the implementation) |
| 2026-03-10 13:36:00 +0100 | <Freakie> | so basically you mean pinning the BDDs while traversing them? |
| 2026-03-10 13:35:31 +0100 | <comerijn> | Freakie: Where compact regions help is if you have a large BDD that you keep around indefinitely, by shoving it off into a compact region it basically no longer counts as live data for the GC scaling |
| 2026-03-10 13:35:31 +0100 | <Freakie> | because the children can be anywhere on a deeper level |
| 2026-03-10 13:35:17 +0100 | <Freakie> | the point of the priority queue is to extract all relevant data from a particular node before progressing past it, but there's no neat structure in terms of compactness |
| 2026-03-10 13:34:19 +0100 | <Freakie> | other times it's just traversal for e.g. satcount |
| 2026-03-10 13:34:10 +0100 | <Freakie> | a separate algorithm then builds up the BDD again |
| 2026-03-10 13:33:57 +0100 | <Freakie> | the apply algorithm takes two bdds (lists) and outputs lists of arcs that make up an unreduced bdd |
| 2026-03-10 13:33:35 +0100 | <Freakie> | yes and no? depends on the algorithm |
| 2026-03-10 13:33:16 +0100 | <comerijn> | Freakie: So you're incrementally building the representation of a BDD? |
| 2026-03-10 13:32:44 +0100 | <Freakie> | so the priority queue gives the most relevant information based on where in the list I am |
| 2026-03-10 13:32:35 +0100 | <comerijn> | Freakie: The point of regions is that you have, say 500 MB of data that is live for your entire algorithm then you reduce the GC cost of that data from "copy 500 MB" to "copy a pointer to the compact region" |
| 2026-03-10 13:32:08 +0100 | <Freakie> | yeah so I have a list representation of the BDD nodes that I need to either work on top down or bottom up; in either case I need to get as much information as possible from the input lists before proceeding to the tail |
| 2026-03-10 13:31:20 +0100 | <comerijn> | Yeah |
| 2026-03-10 13:31:03 +0100 | <Freakie> | are you familiar with binary decision diagrams? just for the sake of terminology |
| 2026-03-10 13:30:47 +0100 | <Freakie> | I'm a little unsure how compact regions might help since other than what I just mentioned I don't think there are a lot of options to separate the regions during the iterations of the algorithm itself |
| 2026-03-10 13:30:44 +0100 | <comerijn> | What are you putting in the queue? |
| 2026-03-10 13:30:12 +0100 | <Freakie> | which shouldn't help asymptotically but should reduce the dependency on hte priority queue |
| 2026-03-10 13:29:34 +0100 | <comerijn> | Freakie: this blog https://www.channable.com/tech/lessons-in-managing-haskell-memory |
| 2026-03-10 13:29:33 +0100 | <Freakie> | I'm planning on restructuring the code so that it can take use of the layout of the lists I'm working with |
| 2026-03-10 13:29:03 +0100 | <comerijn> | hmm |
| 2026-03-10 13:29:00 +0100 | <Freakie> | for both of those |
| 2026-03-10 13:28:58 +0100 | <Freakie> | it didn't seem to have much of an impact when I tried but I could try again |
| 2026-03-10 13:28:53 +0100 | <comerijn> | Might wanna try tweaking the number of generations too? |
| 2026-03-10 13:28:31 +0100 | <comerijn> | Freakie: Try the default GC with the -qg runtime option? |
| 2026-03-10 13:28:28 +0100 | <Freakie> | it's an acceptable time loss compared to being able to actually scale the problem (if it doesn't get worse for even greater input sizes anyway) |
| 2026-03-10 13:28:10 +0100 | <Freakie> | I mean yeah productivity suffered *technically* but the runtime was about the same |
| 2026-03-10 13:27:47 +0100 | <comerijn> | But in terms of productivity the default GC is better |
| 2026-03-10 13:27:47 +0100 | <Freakie> | nonmoving-gc roughly halved residency and total memory in use for the largest input size I've tested |
| 2026-03-10 13:27:29 +0100 | <comerijn> | The main downside of the moving GC is that it causes longer GC pauses (bad for latency sensitive/real-time applications) |
| 2026-03-10 13:27:05 +0100 | <comerijn> | Freakie: then the regular one |
| 2026-03-10 13:26:57 +0100 | <comerijn> | Freakie: nonmoving-gc will generally have lower throughput |
| 2026-03-10 13:26:56 +0100 | <Freakie> | no, I thought that would be too difficult to reason |
| 2026-03-10 13:26:42 +0100 | <comerijn> | Freakie: Are you running threaded runtime and large number of capabilities? |