Newest at the top
2024-09-21 21:26:49 +0200 | <tuxpaint> | i guess my program doesn't use any timers, so the fastest option would be to use `-V0` and disable the RTS clock? |
2024-09-21 21:24:50 +0200 | rvalue- | rvalue |
2024-09-21 21:23:29 +0200 | connrs | (~connrs@user/connrs) (Ping timeout: 248 seconds) |
2024-09-21 21:22:33 +0200 | <tuxpaint> | so i can compile with `-with-rtsopts="-V0.001"` and now it's the same. if i do `-V0.00001` it starts up even faster |
2024-09-21 21:20:42 +0200 | <tuxpaint> | monochrom: your hypothesis is somewhat confirmed i think. compiling with `-rtsopts` and using `-V0.001` gets me the same startup time as -prof https://downloads.haskell.org/ghc/latest/docs/users_guide/profiling.html#rts-flag--V%20%E2%9F%A8se… |
2024-09-21 21:20:33 +0200 | merijn | (~merijn@204-220-045-062.dynamic.caiway.nl) (Ping timeout: 265 seconds) |
2024-09-21 21:20:23 +0200 | rekahsoft | (~rekahsoft@76.69.85.220) |
2024-09-21 21:19:11 +0200 | rekahsoft | (~rekahsoft@bras-base-orllon1103w-grc-06-76-69-85-220.dsl.bell.ca) (Remote host closed the connection) |
2024-09-21 21:17:49 +0200 | rvalue | (~rvalue@user/rvalue) (Ping timeout: 260 seconds) |
2024-09-21 21:17:10 +0200 | briandaed | (~root@185.234.210.211.r.toneticgroup.pl) (Remote host closed the connection) |
2024-09-21 21:16:39 +0200 | rvalue- | (~rvalue@user/rvalue) |
2024-09-21 21:15:15 +0200 | merijn | (~merijn@204-220-045-062.dynamic.caiway.nl) |
2024-09-21 21:15:10 +0200 | <geekosaur> | heh |
2024-09-21 21:14:56 +0200 | <monochrom> | Ugh the joke is on me. I lost an old exe that uses regex-base etc, now I have to recompile it again. |
2024-09-21 21:08:31 +0200 | <geekosaur> | overhead can still eat your lunch though… |
2024-09-21 21:08:06 +0200 | <tomsmeding> | tuxpaint: but when a haskell program is going, it's quite a bit faster than python ;) |
2024-09-21 21:08:01 +0200 | <tuxpaint> | regardless if it's correct to care about startup time, many people do, so maybe that is a good enough reason |
2024-09-21 21:07:56 +0200 | <geekosaur> | so now they've moved on to startup |
2024-09-21 21:07:44 +0200 | <geekosaur> | they already did a lot of work on shutdown time, which was kinda horrendous |
2024-09-21 21:07:32 +0200 | <tuxpaint> | 12ms puts it at the same speed as the perl/python scripts that do the same thing, which feels a little wrong. |
2024-09-21 21:07:29 +0200 | <geekosaur> | there is ongoing work on optimizing startup time, actually |
2024-09-21 21:07:12 +0200 | merijn | (~merijn@204-220-045-062.dynamic.caiway.nl) (Ping timeout: 272 seconds) |
2024-09-21 21:05:31 +0200 | <tomsmeding> | 12ms is a fairly long time |
2024-09-21 21:05:01 +0200 | <tomsmeding> | it's likely not top priority, and it shouldn't be |
2024-09-21 21:04:53 +0200 | <tomsmeding> | startup time is a valid thing to optimise for |
2024-09-21 21:04:32 +0200 | <dolio> | Should they, though? |
2024-09-21 21:03:54 +0200 | <tomsmeding> | ok I dunno, someone with knowledge about the RTS and the ability to print debug info from it (with timestamps) should investigate this, not me |
2024-09-21 21:03:24 +0200 | <tomsmeding> | but perf doesn't see much difference between the default and the profiling version |
2024-09-21 21:02:14 +0200 | <tomsmeding> | (the function is literally 5 instructions long: `endbr64` (control flow protection magic), `%ecx := 4096`, `%eax := 0`, `rep stos` which does essentially memset, `ret`) |
2024-09-21 21:01:37 +0200 | merijn | (~merijn@204-220-045-062.dynamic.caiway.nl) |
2024-09-21 21:00:41 +0200 | caconym | (~caconym@user/caconym) |
2024-09-21 21:00:02 +0200 | caconym | (~caconym@user/caconym) (Quit: bye) |
2024-09-21 20:59:20 +0200 | <tomsmeding> | and that function fills a 4KiB page with zeros |
2024-09-21 20:58:51 +0200 | <tomsmeding> | okay with `perf record -F max` (higher sample frequency), I reliably get that about 23% of the time is spent in a kernel function called clear_page_erms |
2024-09-21 20:57:32 +0200 | mreh | (~matthew@host86-146-25-125.range86-146.btcentralplus.com) |
2024-09-21 20:55:26 +0200 | <tomsmeding> | that's 1 sample |
2024-09-21 20:55:23 +0200 | <tomsmeding> | yeah right |
2024-09-21 20:55:21 +0200 | <tomsmeding> | it's like, 70% of your time is in this kernel function. Okay, so where? Well, 100% of the time in this kernel function is in this instruction: push %rbx |
2024-09-21 20:54:39 +0200 | <tomsmeding> | okay perf(1) is pointless here, it gets way too few samples |
2024-09-21 20:53:56 +0200 | srcd | (~srcd@93-140-131-235.adsl.net.t-com.hr) (Client Quit) |
2024-09-21 20:52:55 +0200 | srcd | (~srcd@93-140-131-235.adsl.net.t-com.hr) |
2024-09-21 20:51:35 +0200 | <tomsmeding> | (I hadn't expected it to) |
2024-09-21 20:51:27 +0200 | <tomsmeding> | (for the record: ghc -O also doesn't help >:D) |
2024-09-21 20:51:18 +0200 | <tomsmeding> | ~all of the time in the perf(1) report output points to the kernel for me |
2024-09-21 20:51:07 +0200 | merijn | (~merijn@204-220-045-062.dynamic.caiway.nl) (Ping timeout: 264 seconds) |
2024-09-21 20:50:05 +0200 | <monochrom> | This is strange. But you look at "user" and "sys" and you see that the wallclock time just means the process is sleeping most of the time. Yeah what is it waiting for? |
2024-09-21 20:50:01 +0200 | <tomsmeding> | profiling this (with perf) is difficult because there's not much there :p |
2024-09-21 20:49:08 +0200 | ash3en | (~Thunderbi@2a03:7846:b6eb:101:93ac:a90a:da67:f207) (Ping timeout: 245 seconds) |
2024-09-21 20:47:42 +0200 | <tuxpaint> | -threaded made no difference for me either, yeah |
2024-09-21 20:47:34 +0200 | <tuxpaint> | yeah i'm basically comparing the startup time of a few different languages. it was odd that haskell is much worse than other compiled counterparts, but i guess somehow -prof changes something |
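
Editor's note: the startup-time comparison discussed in this log, and monochrom's point about wall-clock time versus "user" and "sys" time, can be reproduced with a small harness along these lines. This is only a sketch, not what anyone in the channel actually ran: the function name `measure_startup` and the default of 20 runs are made up for illustration, it assumes a Unix-like system (it relies on Python's `resource` module), and the wall-clock figure includes the cost of spawning the child from Python.

```python
import resource
import subprocess
import sys
import time


def measure_startup(cmd, runs=20):
    """Run `cmd` `runs` times and return mean wall, user, and sys seconds per run.

    `measure_startup` is a hypothetical helper for illustration only.
    """
    # CPU time accumulated so far by children this process has waited for.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall": wall / runs,
        "user": (after.ru_utime - before.ru_utime) / runs,
        "sys": (after.ru_stime - before.ru_stime) / runs,
    }


if __name__ == "__main__":
    # Default to timing a trivial Python start-up if no command is given.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    result = measure_startup(cmd)
    for key in ("wall", "user", "sys"):
        print(f"{key}: {result[key] * 1000:.3f} ms")
```

If wall time is much larger than user plus sys, the process spent most of its life sleeping or waiting rather than computing. Saved under a name of your choosing, the script takes the command to time as its arguments, so the same binary can be timed with and without an RTS option such as `+RTS -V0.001 -RTS`.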