2026/03/07

Newest at the top

2026-03-07 12:08:11 +0100merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) merijn
2026-03-07 12:02:52 +0100 <Guest89> also I've been relying on using eventlog2html but it seems to break fairly easily. are there any other options for visualizing the profiles?
2026-03-07 12:02:32 +0100ChaiTRex(~ChaiTRex@user/chaitrex) ChaiTRex
2026-03-07 12:01:49 +0100 <Guest89> I have some plots from using -hc that tells me which functions allocate the most but to be honest they're not particularly surprising in that department
2026-03-07 12:01:03 +0100 <Guest89> i'll give it a whirl
2026-03-07 12:00:51 +0100 <Guest89> sorry, -h
2026-03-07 12:00:49 +0100 <Leary> -hT: https://downloads.haskell.org/ghc/latest/docs/users_guide/profiling.html#rts-options-heap-prof
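[For reference, the -hT heap profile by closure type that Leary links needs no profiling build. A minimal invocation might look like this; the executable name "myprog" is a placeholder, and on GHC before 9.4 the eventlog route also requires linking with -eventlog:]

```shell
# Heap profile by closure type; no -prof build needed.

# Classic .hp output, rendered with hp2ps:
./myprog +RTS -hT -RTS
hp2ps -c myprog.hp               # writes myprog.ps

# Eventlog route, rendered with eventlog2html:
./myprog +RTS -hT -l -RTS
eventlog2html myprog.eventlog    # writes an interactive HTML chart
```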
2026-03-07 11:59:50 +0100 <Guest89> it's one of the -l(x) RTS settings
2026-03-07 11:59:28 +0100 <haskellbridge> <sm> how do you do that Leary ?
2026-03-07 11:59:22 +0100hiecaq(~hiecaq@user/hiecaq) (Quit: ERC 5.6.0.30.1 (IRC client for GNU Emacs 30.2))
2026-03-07 11:59:04 +0100 <Guest89> my reference implementation generates only a few megabytes of data by comparison but again it's not comparable 1:1
2026-03-07 11:58:57 +0100CiaoSen(~Jura@2a02:8071:64e1:da0:5a47:caff:fe78:33db) CiaoSen
2026-03-07 11:58:49 +0100ChaiTRex(~ChaiTRex@user/chaitrex) (Ping timeout: 258 seconds)
2026-03-07 11:58:11 +0100 <Guest89> the only thing I haven't tried is to force computations in different places
2026-03-07 11:57:34 +0100 <Guest89> https://paste.tomsmeding.com/xZZPhSCR
2026-03-07 11:57:29 +0100Beowulf(florian@sleipnir.bandrate.org)
2026-03-07 11:57:14 +0100 <lambdabot> Help us help you: please paste full code, input and/or output at e.g. https://paste.tomsmeding.com
2026-03-07 11:57:14 +0100 <Leary> @where paste
2026-03-07 11:56:45 +0100 <Leary> Guest89: The problem is less likely to be allocations than unnecessary retention or unwanted thunks bloating your representation. Allocating is almost free, holding onto it is what costs you. In any case, I would start by heap profiling by type, which doesn't actually require a profiling build.
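[A toy illustration of the retention problem Leary describes, not Guest89's code: a two-pass traversal pins the whole list in memory between passes, while a one-pass fold with strict accumulators lets it be collected as it streams.]

```haskell
{-# LANGUAGE BangPatterns #-}

-- Two passes: the full list is retained between sum and length.
meanRetaining :: [Double] -> Double
meanRetaining xs = sum xs / fromIntegral (length xs)

-- One pass with strict accumulators: each cons cell can be
-- garbage-collected as soon as it has been consumed.
meanStreaming :: [Double] -> Double
meanStreaming = go 0 0
  where
    go !s !n []       = s / fromIntegral (n :: Int)
    go !s !n (x : xs) = go (s + x) (n + 1) xs

main :: IO ()
main = print (meanStreaming [1 .. 1000000])  -- 500000.5
```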
2026-03-07 11:56:37 +0100 <Guest89> I guess I'll just write it down:
2026-03-07 11:55:58 +0100 <Guest89> not familiar
2026-03-07 11:55:46 +0100arandombit(~arandombi@user/arandombit) (Ping timeout: 248 seconds)
2026-03-07 11:55:46 +0100 <haskellbridge> <sm> it's a lot easier if you use the matrix room
2026-03-07 11:55:31 +0100 <Guest89> or should I just put it in writing
2026-03-07 11:55:26 +0100 <Guest89> is it possible to paste pictures over IRC?
2026-03-07 11:55:14 +0100 <haskellbridge> <sm> well, how much memory does +RTS -s say is being allocated ?
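[For context, the runtime summary sm asks about is enabled like this ("myprog" is a placeholder):]

```shell
# Print an allocation/GC summary on exit; no special build needed.
./myprog +RTS -s -RTS

# Key lines in the summary:
#   "bytes allocated in the heap"  -- total allocation over the whole run
#   "maximum residency"            -- peak live data
#   "Productivity"                 -- mutator time vs. GC time
```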
2026-03-07 11:53:30 +0100 <Guest89> I just don't see how garbage collection can dominate the runtime so much
2026-03-07 11:52:49 +0100 <haskellbridge> <loonycyborg> ye it's just an example
2026-03-07 11:52:40 +0100 <Guest89> most of the actual allocation happens within the core algorithms, but I've also tried varying levels of strictness there
2026-03-07 11:52:17 +0100 <Guest89> I've tried both strict and non-strict versions of foldl, and I didn't see any issues either way
2026-03-07 11:51:53 +0100 <haskellbridge> <sm> it requires investigation, there's no way round it
2026-03-07 11:51:37 +0100 <haskellbridge> <loonycyborg> Maybe it's thunk explosion somewhere, like from foldl
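[The classic foldl thunk explosion loonycyborg mentions, as a standalone toy rather than anything from Guest89's code:]

```haskell
import Data.List (foldl')

-- foldl builds the whole chain (((0+1)+2)+...) as unevaluated thunks
-- before anything forces it; foldl' forces the accumulator at each step.
sumLazy, sumStrict :: [Int] -> Int
sumLazy   = foldl  (+) 0   -- O(n) live thunks at the point of forcing
sumStrict = foldl' (+) 0   -- constant-space accumulator

main :: IO ()
main = print (sumStrict [1 .. 1000000])  -- 500000500000
```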
2026-03-07 11:51:26 +0100 <Guest89> obviously you can't get the performance *that* close, at least not in terms of memory, but as things are it's both an absurd difference and not feasible
2026-03-07 11:50:56 +0100 <Guest89> I understand, but the issue is that my implementation uses something like 100 times as much memory as a reference implementation in C++
2026-03-07 11:50:51 +0100 <haskellbridge> <sm> the time and space profile will show you the top users of time and space, and will show you any crazy large call counts indicating things called too often
2026-03-07 11:50:38 +0100arandombit(~arandombi@user/arandombit) arandombit
2026-03-07 11:50:08 +0100 <haskellbridge> <sm> it may be normal that the program uses 2 or even 3 times what you think, depending how you measure it, because of how GC works (making copies)
2026-03-07 11:49:57 +0100Ranhir(~Ranhir@157.97.53.139) Ranhir
2026-03-07 11:49:28 +0100 <Guest89> it's supposed to be an implementation of I/O-efficient BDD (binary decision diagram) algorithms, which necessarily generate a lot of data, so I need some way to minimize overhead where it's reasonable
2026-03-07 11:48:33 +0100 <Guest89> I assume there is some structure in my code that exacerbates the problem but I can't really see where
2026-03-07 11:48:10 +0100 <Guest89> I've already been trying to use GHC.Compact but it doesn't seem to have affected runtimes at all
2026-03-07 11:47:48 +0100 <Guest89> I have some data that suggests that the data itself isn't fragmented but the program allocates about twice as much as it uses, which also seems excessive
2026-03-07 11:47:46 +0100 <haskellbridge> <loonycyborg> https://github.com/ezyang/compact <- this might help with excessive gc times
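[The linked work landed in GHC as the ghc-compact boot library; a minimal sketch of its use (list contents illustrative):]

```haskell
import GHC.Compact (compact, getCompact)

main :: IO ()
main = do
  -- compact fully evaluates the structure and copies it into a region;
  -- the GC then treats the region as one object and never traverses
  -- its insides, which is what cuts GC time for large, long-lived data.
  let xs = [1 .. 100000] :: [Int]
  region <- compact xs
  print (sum (getCompact region))  -- 5000050000
```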
2026-03-07 11:46:33 +0100 <haskellbridge> <sm> yes, +RTS -s is useful
2026-03-07 11:46:16 +0100 <Guest89> it's verbose but I can navigate it at least?
2026-03-07 11:45:59 +0100 <Guest89> I've mostly relied on the metrics provided by different RTS options so far
2026-03-07 11:45:32 +0100 <Guest89> I'll be honest, I haven't figured out how to interact with packages like that yet; I've used stuff like eventlog2html but compiled as a separate executable
2026-03-07 11:45:29 +0100 <haskellbridge> <sm> and you're right, it's overallocating (garbage collection should be a small percentage of your run time)
2026-03-07 11:44:12 +0100 <haskellbridge> <sm> it will be hard to understand completely, but full of useful information. You can also process it with https://hackage.haskell.org/package/profiterole which makes it more readable.
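[The profiterole step sm mentions, for reference; it post-processes the .prof file from a profiled run, so this route does require a -prof build ("myprog" is a placeholder):]

```shell
# Run the profiled build to produce myprog.prof:
./myprog +RTS -p -RTS

# Condense the cost-centre tree into a more readable report:
profiterole myprog.prof
```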
2026-03-07 11:42:04 +0100merijn(~merijn@host-cl.cgnat-g.v4.dfn.nl) (Ping timeout: 256 seconds)