2026/07/05

Newest at the top

2026-07-05 13:06:44 +0000merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 13:01:36 +0000 <tomsmeding> but there'll be a minority who has usecases for [a] that fall through the cracks of the optimisations and get worse, and they'll complain :)
2026-07-05 13:00:55 +0000 <tomsmeding> I think a decently large portion of haskell users would be happy with such a change, because it's "obviously useful"
2026-07-05 13:00:07 +0000 <jaror> We could store a simple RC bit in the constructor
2026-07-05 12:59:47 +0000 <jaror> It could also take into account how many references there are to the elements to decide how to chunk
2026-07-05 12:59:39 +0000 <tomsmeding> hm, probably less than double, but still more because of the duplication of the second-element pointer
2026-07-05 12:59:11 +0000 <tomsmeding> (the chunks would need to be quite small, because you can have pathological cases where you create a Hd1 and then many copies of that list with one additional element added; that results in double the list memory overhead as when you'd do the same trick with normal [a])
2026-07-05 12:57:25 +0000 <tomsmeding> -> possible, but significant additional administration
2026-07-05 12:57:15 +0000 <tomsmeding> that's exactly what I was also thinking of
2026-07-05 12:57:04 +0000 <tomsmeding> but there are bound to be uses of [a] in the wild that get disproportionally punished by List3 somehow
2026-07-05 12:56:52 +0000 <jaror> I wonder if we could have special "offset" pointers which the GC can take into account and then split the chunks when collecting
2026-07-05 12:56:43 +0000merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) (Ping timeout: 264 seconds)
2026-07-05 12:56:22 +0000eggplantade(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336) (Ping timeout: 240 seconds)
2026-07-05 12:56:09 +0000 <jaror> ah
2026-07-05 12:56:03 +0000 <tomsmeding> (and which thus must be backwards-compatible)
2026-07-05 12:55:54 +0000 <tomsmeding> (I'm talking about a hypothetical world where people would be willing to specially handle [a] in GHC and the RTS)
2026-07-05 12:55:30 +0000 <tomsmeding> well, if the individual elements are large (but boxed so just a single pointer), then retaining even just one redundant element can be problematic
2026-07-05 12:55:00 +0000 <jaror> And if you only chunk a few elements at a time (say 64 bytes) then I don't think there will be that many issues with retaining things for too long
2026-07-05 12:53:55 +0000 <tomsmeding> at which point you get very close to your List3
2026-07-05 12:53:51 +0000 <jaror> Yes
2026-07-05 12:53:40 +0000 <tomsmeding> jaror: yes, so the proposal as stated falls flat, but you could easily imagine a lazy-bytestring-like chunked representation that does hvae the same asymptotics
2026-07-05 12:52:58 +0000 <tomsmeding> but I do wonder if there is something that GHC could, potentially, do here in terms of chunking lists; the issue is that it would introduce a bunch of additional administration, in particular to ensure that list elements can be properly GC'd when they're not referenced any more, always
2026-07-05 12:52:57 +0000 <jaror> I mean, text is asymptotically slower for many things
2026-07-05 12:51:56 +0000 <tomsmeding> I think that suggestion was mostly dismissed as "but the point of [a] is that it's a list"
2026-07-05 12:51:52 +0000merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 12:51:14 +0000eggplantade(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336)
2026-07-05 12:50:55 +0000 <tomsmeding> I think someone in this channel at some point proposed that GHC ought to implement String more efficiently, transparently using a Text-like representation under the hood
2026-07-05 12:50:23 +0000lisbeths(uid135845@id-135845.lymington.irccloud.com) (Quit: Connection closed for inactivity)
2026-07-05 12:49:55 +0000 <tomsmeding> (to nobody's surprise)
2026-07-05 12:49:46 +0000 <tomsmeding> "computers like arrays"
2026-07-05 12:49:33 +0000 <tomsmeding> nice, that confirms the hypothesis about why traversal was faster
2026-07-05 12:42:12 +0000ftzm427ftzm
2026-07-05 12:42:09 +0000ftzm427(~ftzm@85.80.244.25)
2026-07-05 12:41:55 +0000 <jaror> but obviously does not give you O(log n) access or length computation
2026-07-05 12:41:30 +0000 <jaror> That has less construction overhead and more summing gains
2026-07-05 12:41:19 +0000merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) (Ping timeout: 276 seconds)
2026-07-05 12:41:07 +0000Vizious(~bes@user/Vizious) Vizious
2026-07-05 12:40:46 +0000 <jaror> data List3Tail a = Nil3 | Cons3 a a a !(List3Tail a)
2026-07-05 12:40:46 +0000 <jaror> data List3 a = Hd0 !(List3Tail a) | Hd1 a !(List3Tail a) | Hd2 a a !(List3Tail a)
2026-07-05 12:40:46 +0000 <jaror> I've now also benchmarked a simple list with 3 element cells:
2026-07-05 12:39:45 +0000lnxxcya(~lynxx@i577BF0A5.versanet.de)
2026-07-05 12:38:29 +0000eggplantade(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336) (Ping timeout: 242 seconds)
2026-07-05 12:36:26 +0000merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 12:32:46 +0000eggplantade(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336)
2026-07-05 12:25:45 +0000merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) (Ping timeout: 248 seconds)
2026-07-05 12:22:47 +0000takuan(~takuan@d8d86b996.access.telenet.be)
2026-07-05 12:22:32 +0000takuan(~takuan@d8d86b996.access.telenet.be) (Read error: Connection reset by peer)
2026-07-05 12:21:05 +0000merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 12:20:27 +0000eggplantade(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336) (Ping timeout: 254 seconds)
2026-07-05 12:14:59 +0000eggplantade(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336)