
Newest at the top

2024-09-21 21:28:43 +0200connrs(~connrs@user/connrs)
2024-09-21 21:28:11 +0200 <tomsmeding> ooh
2024-09-21 21:28:09 +0200 <tomsmeding> > The default is 0.001 seconds when profiling, and 0.01 otherwise.
2024-09-21 21:26:49 +0200 <tuxpaint> i guess my program doesn't use any timers, so the fastest option would be to use `-V0` and disable the RTS clock?
2024-09-21 21:24:50 +0200rvalue-rvalue
2024-09-21 21:23:29 +0200connrs(~connrs@user/connrs) (Ping timeout: 248 seconds)
2024-09-21 21:22:33 +0200 <tuxpaint> so i can compile with `-with-rtsopts="-V0.001"` and now it's the same. if i do `-V0.00001` it starts up even faster
2024-09-21 21:20:42 +0200 <tuxpaint> monochrom: your hypothesis is somewhat confirmed i think. compiling with `rtsflags` and using `-V0.001` gets me the same startup time as -prof https://downloads.haskell.org/ghc/latest/docs/users_guide/profiling.html#rts-flag--V%20%E2%9F%A8se…
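A minimal sketch for checking which `-V` tick interval a binary is actually running with, which is the setting compared in the messages above. It reads the value back through `GHC.RTS.Flags` from base. The helper `rtsTimeToSeconds` is hypothetical, and the nanosecond unit is an assumption about the RTS's internal time resolution, so the raw value is printed as well.

```haskell
import Data.Word (Word64)
import GHC.RTS.Flags (getMiscFlags, tickInterval)

-- | Convert a raw tick interval to seconds.
-- ASSUMPTION: the RTS stores the interval in nanoseconds.
rtsTimeToSeconds :: Word64 -> Double
rtsTimeToSeconds t = fromIntegral t / 1e9

main :: IO ()
main = do
  misc <- getMiscFlags
  let raw = tickInterval misc
  -- Print the raw value too, since the unit above is only an assumption.
  putStrLn ("tick interval (raw RtsTime): " ++ show raw)
  putStrLn ("tick interval (seconds, assuming ns): " ++ show (rtsTimeToSeconds raw))
```

Building with `ghc -rtsopts Main.hs` and running `./Main +RTS -V0.001` should let you compare the reported value against the default; `-with-rtsopts="-V0.001"` bakes the setting in at link time, as in the log.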
2024-09-21 21:20:33 +0200merijn(~merijn@204-220-045-062.dynamic.caiway.nl) (Ping timeout: 265 seconds)
2024-09-21 21:20:23 +0200rekahsoft(~rekahsoft@
2024-09-21 21:19:11 +0200rekahsoft(~rekahsoft@bras-base-orllon1103w-grc-06-76-69-85-220.dsl.bell.ca) (Remote host closed the connection)
2024-09-21 21:17:49 +0200rvalue(~rvalue@user/rvalue) (Ping timeout: 260 seconds)
2024-09-21 21:17:10 +0200briandaed(~root@ (Remote host closed the connection)
2024-09-21 21:16:39 +0200rvalue-(~rvalue@user/rvalue)
2024-09-21 21:15:15 +0200merijn(~merijn@204-220-045-062.dynamic.caiway.nl)
2024-09-21 21:15:10 +0200 <geekosaur> heh
2024-09-21 21:14:56 +0200 <monochrom> Ugh the joke is on me. I lost an old exe that uses regex-base etc, now I have to recompile it again.
2024-09-21 21:08:31 +0200 <geekosaur> overhead can still eat your lunch though…
2024-09-21 21:08:06 +0200 <tomsmeding> tuxpaint: but when a haskell program is going, it's quite a bit faster than python ;)
2024-09-21 21:08:01 +0200 <tuxpaint> regardless if it's correct to care about startup time, many people do, so maybe that is a good enough reason
2024-09-21 21:07:56 +0200 <geekosaur> so now they've moved on to startup
2024-09-21 21:07:44 +0200 <geekosaur> they already did a lot of work on shutdown time, which was kinda horrendous
2024-09-21 21:07:32 +0200 <tuxpaint> 12ms puts it at the same speed as the perl/python scripts that do the same thing, which feels a little wrong.
2024-09-21 21:07:29 +0200 <geekosaur> there is ongoing work on optimizing startup time, actually
2024-09-21 21:07:12 +0200merijn(~merijn@204-220-045-062.dynamic.caiway.nl) (Ping timeout: 272 seconds)
2024-09-21 21:05:31 +0200 <tomsmeding> 12ms is a fairly long time
2024-09-21 21:05:01 +0200 <tomsmeding> it's likely not top priority, and it shouldn't be
2024-09-21 21:04:53 +0200 <tomsmeding> startup time is a valid thing to optimise for
2024-09-21 21:04:32 +0200 <dolio> Should they, though?
2024-09-21 21:03:54 +0200 <tomsmeding> ok I dunno, someone with knowledge about the RTS and the ability to print debug info from it (with timestamps) should investigate this, not me
2024-09-21 21:03:24 +0200 <tomsmeding> but perf doesn't see much difference between the default and the profiling version
2024-09-21 21:02:14 +0200 <tomsmeding> (the function is literally 5 instructions long: `endbr64` (control flow protection magic), `%ecx := 4096`, `%eax := 0`, `rep stos` which does essentially memset, `ret`)
2024-09-21 21:01:37 +0200merijn(~merijn@204-220-045-062.dynamic.caiway.nl)
2024-09-21 21:00:41 +0200caconym(~caconym@user/caconym)
2024-09-21 21:00:02 +0200caconym(~caconym@user/caconym) (Quit: bye)
2024-09-21 20:59:20 +0200 <tomsmeding> and that function fills a 4KiB page with zeros
2024-09-21 20:58:51 +0200 <tomsmeding> okay with `perf record -F max` (higher sample frequency), I reliably get that about 23% of the time is spent in a kernel function called clear_page_erms
2024-09-21 20:57:32 +0200mreh(~matthew@host86-146-25-125.range86-146.btcentralplus.com)
2024-09-21 20:55:26 +0200 <tomsmeding> that's 1 sample
2024-09-21 20:55:23 +0200 <tomsmeding> yeah right
2024-09-21 20:55:21 +0200 <tomsmeding> it's like, 70% of your time is in this kernel function. Okay, so where? Well, 100% of the time in this kernel function is in this instruction: push %rbx
2024-09-21 20:54:39 +0200 <tomsmeding> okay perf(1) is pointless here, it gets way too few samples
2024-09-21 20:53:56 +0200srcd(~srcd@93-140-131-235.adsl.net.t-com.hr) (Client Quit)
2024-09-21 20:52:55 +0200srcd(~srcd@93-140-131-235.adsl.net.t-com.hr)
2024-09-21 20:51:35 +0200 <tomsmeding> (I hadn't expected it to)
2024-09-21 20:51:27 +0200 <tomsmeding> (for the record: ghc -O also doesn't help >:D)
2024-09-21 20:51:18 +0200 <tomsmeding> ~all of the time in the perf(1) report output points to the kernel for me
2024-09-21 20:51:07 +0200merijn(~merijn@204-220-045-062.dynamic.caiway.nl) (Ping timeout: 264 seconds)
2024-09-21 20:50:05 +0200 <monochrom> This is strange. But you look at "user" and "sys" and you see that the wallclock time just means the process is sleeping most of the time. Yeah what is it waiting for?
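A rough illustration of the reasoning in the message above: when wall-clock time is much larger than CPU time ("user" plus "sys"), the process spent most of the interval asleep or waiting. This sketch uses only base and simulates the waiting with `threadDelay`; the helper `cpuFraction` is a hypothetical name, and this is not how the original measurement was taken.

```haskell
import Control.Concurrent (threadDelay)
import Data.Word (Word64)
import GHC.Clock (getMonotonicTimeNSec)
import System.CPUTime (getCPUTime)

-- | Fraction of the wall-clock interval that was spent on the CPU.
-- CPU time is given in picoseconds, wall time in nanoseconds.
cpuFraction :: Integer -> Word64 -> Double
cpuFraction cpuPicos wallNanos
  | wallNanos == 0 = 0
  | otherwise = (fromIntegral cpuPicos / 1e12) / (fromIntegral wallNanos / 1e9)

main :: IO ()
main = do
  wall0 <- getMonotonicTimeNSec
  cpu0 <- getCPUTime
  threadDelay 50000 -- sleep for roughly 50 ms; this uses almost no CPU
  cpu1 <- getCPUTime
  wall1 <- getMonotonicTimeNSec
  let frac = cpuFraction (cpu1 - cpu0) (wall1 - wall0)
  -- A small fraction means the process was mostly waiting, not computing.
  putStrLn ("CPU time / wall time: " ++ show frac)
```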
2024-09-21 20:50:01 +0200 <tomsmeding> profiling this (with perf) is difficult because there's not much there :p