2026/07/05

Newest at the top

2026-07-05 13:10:36 +0000	eggplantade	(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336)
2026-07-05 13:06:44 +0000	merijn	(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 13:01:36 +0000	<tomsmeding>	but there'll be a minority who has usecases for [a] that fall through the cracks of the optimisations and get worse, and they'll complain :)
2026-07-05 13:00:55 +0000	<tomsmeding>	I think a decently large portion of haskell users would be happy with such a change, because it's "obviously useful"
2026-07-05 13:00:07 +0000	<jaror>	We could store a simple RC bit in the constructor
2026-07-05 12:59:47 +0000	<jaror>	It could also take into account how many references there are to the elements to decide how to chunk
2026-07-05 12:59:39 +0000	<tomsmeding>	hm, probably less than double, but still more because of the duplication of the second-element pointer
2026-07-05 12:59:11 +0000	<tomsmeding>	(the chunks would need to be quite small, because you can have pathological cases where you create a Hd1 and then many copies of that list with one additional element added; that results in double the list memory overhead as when you'd do the same trick with normal [a])
2026-07-05 12:57:25 +0000	<tomsmeding>	-> possible, but significant additional administration
2026-07-05 12:57:15 +0000	<tomsmeding>	that's exactly what I was also thinking of
2026-07-05 12:57:04 +0000	<tomsmeding>	but there are bound to be uses of [a] in the wild that get disproportionally punished by List3 somehow
2026-07-05 12:56:52 +0000	<jaror>	I wonder if we could have special "offset" pointers which the GC can take into account and then split the chunks when collecting
2026-07-05 12:56:43 +0000	merijn	(~merijn@host-cl.cgnat-g.v4.dfn.nl) (Ping timeout: 264 seconds)
2026-07-05 12:56:22 +0000	eggplantade	(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336) (Ping timeout: 240 seconds)
2026-07-05 12:56:09 +0000	<jaror>	ah
2026-07-05 12:56:03 +0000	<tomsmeding>	(and which thus must be backwards-compatible)
2026-07-05 12:55:54 +0000	<tomsmeding>	(I'm talking about a hypothetical world where people would be willing to specially handle [a] in GHC and the RTS)
2026-07-05 12:55:30 +0000	<tomsmeding>	well, if the individual elements are large (but boxed so just a single pointer), then retaining even just one redundant element can be problematic
2026-07-05 12:55:00 +0000	<jaror>	And if you only chunk a few elements at a time (say 64 bytes) then I don't think there will be that many issues with retaining things for too long
2026-07-05 12:53:55 +0000	<tomsmeding>	at which point you get very close to your List3
2026-07-05 12:53:51 +0000	<jaror>	Yes
2026-07-05 12:53:40 +0000	<tomsmeding>	jaror: yes, so the proposal as stated falls flat, but you could easily imagine a lazy-bytestring-like chunked representation that does hvae the same asymptotics
2026-07-05 12:52:58 +0000	<tomsmeding>	but I do wonder if there is something that GHC could, potentially, do here in terms of chunking lists; the issue is that it would introduce a bunch of additional administration, in particular to ensure that list elements can be properly GC'd when they're not referenced any more, always
2026-07-05 12:52:57 +0000	<jaror>	I mean, text is asymptotically slower for many things
2026-07-05 12:51:56 +0000	<tomsmeding>	I think that suggestion was mostly dismissed as "but the point of [a] is that it's a list"
2026-07-05 12:51:52 +0000	merijn	(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 12:51:14 +0000	eggplantade	(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336)
2026-07-05 12:50:55 +0000	<tomsmeding>	I think someone in this channel at some point proposed that GHC ought to implement String more efficiently, transparently using a Text-like representation under the hood
2026-07-05 12:50:23 +0000	lisbeths	(uid135845@id-135845.lymington.irccloud.com) (Quit: Connection closed for inactivity)
2026-07-05 12:49:55 +0000	<tomsmeding>	(to nobody's surprise)
2026-07-05 12:49:46 +0000	<tomsmeding>	"computers like arrays"
2026-07-05 12:49:33 +0000	<tomsmeding>	nice, that confirms the hypothesis about why traversal was faster
2026-07-05 12:42:12 +0000	ftzm427	ftzm
2026-07-05 12:42:09 +0000	ftzm427	(~ftzm@85.80.244.25)
2026-07-05 12:41:55 +0000	<jaror>	but obviously does not give you O(log n) access or length computation
2026-07-05 12:41:30 +0000	<jaror>	That has less construction overhead and more summing gains
2026-07-05 12:41:19 +0000	merijn	(~merijn@host-cl.cgnat-g.v4.dfn.nl) (Ping timeout: 276 seconds)
2026-07-05 12:41:07 +0000	Vizious	(~bes@user/Vizious) Vizious
2026-07-05 12:40:46 +0000	<jaror>	data List3Tail a = Nil3 \| Cons3 a a a !(List3Tail a)
2026-07-05 12:40:46 +0000	<jaror>	data List3 a = Hd0 !(List3Tail a) \| Hd1 a !(List3Tail a) \| Hd2 a a !(List3Tail a)
2026-07-05 12:40:46 +0000	<jaror>	I've now also benchmarked a simple list with 3 element cells:
2026-07-05 12:39:45 +0000	lnxxcya	(~lynxx@i577BF0A5.versanet.de)
2026-07-05 12:38:29 +0000	eggplantade	(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336) (Ping timeout: 242 seconds)
2026-07-05 12:36:26 +0000	merijn	(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 12:32:46 +0000	eggplantade	(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336)
2026-07-05 12:25:45 +0000	merijn	(~merijn@host-cl.cgnat-g.v4.dfn.nl) (Ping timeout: 248 seconds)
2026-07-05 12:22:47 +0000	takuan	(~takuan@d8d86b996.access.telenet.be)
2026-07-05 12:22:32 +0000	takuan	(~takuan@d8d86b996.access.telenet.be) (Read error: Connection reset by peer)
2026-07-05 12:21:05 +0000	merijn	(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-07-05 12:20:27 +0000	eggplantade	(~eggplanta@2600:1702:8450:c370:9436:d382:832a:2336) (Ping timeout: 254 seconds)