2022/06/22

2022-06-22 00:02:07 +0200__monty__(~toonn@user/toonn) (Quit: leaving)
2022-06-22 00:05:06 +0200winny(~weechat@user/winny)
2022-06-22 00:06:06 +0200bontaq(~user@ool-45779fe5.dyn.optonline.net) (Ping timeout: 276 seconds)
2022-06-22 00:06:23 +0200michalz(~michalz@185.246.204.107) (Remote host closed the connection)
2022-06-22 00:11:07 +0200takuan(~takuan@178-116-218-225.access.telenet.be) (Remote host closed the connection)
2022-06-22 00:14:09 +0200n1essa(~nessa@75-164-218-34.ptld.qwest.net) (Quit: leaving)
2022-06-22 00:15:04 +0200k8yun(~k8yun@user/k8yun) (Read error: Connection reset by peer)
2022-06-22 00:15:42 +0200acidjnk_new(~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) (Ping timeout: 264 seconds)
2022-06-22 00:17:05 +0200odnes(~odnes@5-203-249-68.pat.nym.cosmote.net) (Remote host closed the connection)
2022-06-22 00:19:14 +0200money(~Gambino@user/polo)
2022-06-22 00:20:27 +0200rito_(~rito_gh@45.112.243.199) (Ping timeout: 256 seconds)
2022-06-22 00:25:37 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 00:29:50 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 240 seconds)
2022-06-22 00:30:33 +0200mikoto-chan(~mikoto-ch@esm-84-240-99-143.netplaza.fi)
2022-06-22 00:31:20 +0200moonsheep(~user@user/moonsheep)
2022-06-22 00:31:30 +0200 <moonsheep> Hi there again!
2022-06-22 00:31:43 +0200 <moonsheep> I'm trying to install accelerate, and I've installed llvm 9 from source.
2022-06-22 00:32:05 +0200 <moonsheep> Now when I try to build my project, accelerate-llvm reports the following: `<command line>: libLLVMXRay.so.9: cannot open shared object file: No such file or directory`
2022-06-22 00:32:24 +0200 <moonsheep> Yet if I go look under /usr/local/lib it is clearly there
2022-06-22 00:32:52 +0200 <moonsheep> `llvm-config --libdir` does indeed return `/usr/local/lib`
2022-06-22 00:34:17 +0200money(~Gambino@user/polo) (Read error: Connection reset by peer)
2022-06-22 00:37:30 +0200Guest3106(~Gambino@user/polo)
2022-06-22 00:38:10 +0200jgeerds(~jgeerds@55d45f48.access.ecotel.net) (Ping timeout: 240 seconds)
2022-06-22 00:38:13 +0200 <geekosaur> llvm-config won't help here, either /usr/local/lib needs to be listed in /etc/ld.so.conf (or under /etc/ld.so.conf.d, on ubuntu) or you need to arrange for accelerate-llvm to use -R
2022-06-22 00:38:40 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) (Remote host closed the connection)
2022-06-22 00:38:51 +0200Guest3106(~Gambino@user/polo) (Remote host closed the connection)
2022-06-22 00:39:39 +0200 <monochrom> And you probably still need to run "sudo ldconfig" because it is /etc/ld.so.cache that is consulted at run time, rather than a real search.
2022-06-22 00:40:20 +0200Polo__(~Gambino@user/polo)
2022-06-22 00:40:35 +0200Polo__(~Gambino@user/polo) (Read error: Connection reset by peer)
2022-06-22 00:42:30 +0200 <geekosaur> yes
2022-06-22 00:42:36 +0200 <geekosaur> (sorry, making dinner)
2022-06-22 00:42:43 +0200money(~Gambino@user/polo)
2022-06-22 00:43:08 +0200 <monochrom> Ah, I'm a spoiled kid, I just order through Ubereats and continue to IRC :)
2022-06-22 00:45:08 +0200Guest9447(~Gambino@user/polo)
2022-06-22 00:45:31 +0200Guest9447(~Gambino@user/polo) (Client Quit)
2022-06-22 00:46:25 +0200 <EvanR> dammit don't tempt me
2022-06-22 00:46:32 +0200jmdaemon(~jmdaemon@user/jmdaemon) (Quit: ZNC 1.8.2 - https://znc.in)
2022-06-22 00:47:29 +0200money(~Gambino@user/polo) (Ping timeout: 246 seconds)
2022-06-22 00:47:39 +0200jmdaemon(~jmdaemon@user/jmdaemon)
2022-06-22 00:49:17 +0200 <moonsheep> geekosaur: I did, but that doesn't seem to have any effect
2022-06-22 00:49:31 +0200 <moonsheep> My /etc/ld.so.conf just loads all the files under /etc/ld.so.conf.d
2022-06-22 00:49:42 +0200 <moonsheep> I added one that has /usr/local/lib
2022-06-22 00:49:57 +0200alp__(~alp@user/alp) (Ping timeout: 268 seconds)
2022-06-22 00:50:12 +0200 <geekosaur> is it named <whatever>.conf? and did you run `sudo ldconfig` afterward like monochrom said?
2022-06-22 00:50:14 +0200 <moonsheep> And still accelerate fails to build
2022-06-22 00:50:27 +0200 <moonsheep> It is named llvm9.conf and yes I did
2022-06-22 00:50:38 +0200 <moonsheep> I even tried manually removing the cache file but it didn't seem to help
2022-06-22 00:50:55 +0200 <geekosaur> uh, that sounds like a good way to break your system
2022-06-22 00:50:55 +0200 <moonsheep> Oh wait my bad, I'm blind
2022-06-22 00:51:01 +0200 <moonsheep> It's a different error
2022-06-22 00:51:19 +0200 <moonsheep> [a very long path]-ghc8.10.7.so: undefined symbol: _ZTIN4llvm13ErrorInfoBaseE
2022-06-22 00:51:30 +0200 <moonsheep> I am supposed to use llvm 9.0.1 right?
2022-06-22 00:52:08 +0200 <moonsheep> So I guess now it can find the library but it fails to link with it
2022-06-22 00:52:56 +0200 <geekosaur> sounds like it, but the question is what is failing to link with it
2022-06-22 00:53:12 +0200 <geekosaur> "a very long path" minus the long hash at the end
2022-06-22 00:53:55 +0200 <moonsheep> /home/moonsheep/.stack/snapshots/x86_64-linux-tinfo6/<hash>/8.10.7/lib/x86_64-linux-ghc-8.10.7/libHSllvm-hs-9.0.1-S639BV4lBwDq2AVMyPWFd-ghc8.10.7.so: undefined symbol: _ZTIN4llvm13ErrorInfoBaseE
2022-06-22 00:54:38 +0200gurkenglas(~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de) (Ping timeout: 244 seconds)
2022-06-22 00:55:10 +0200 <geekosaur> odd. I think you need someone familiar with accelerate-llvm at this point
2022-06-22 00:55:10 +0200zeenk(~zeenk@2a02:2f04:a301:3d00:39df:1c4b:8a55:48d3) (Quit: Konversation terminated!)
2022-06-22 00:55:38 +0200 <moonsheep> Hmm, maybe I should try purging everything
2022-06-22 00:55:52 +0200jmdaemon(~jmdaemon@user/jmdaemon) (Quit: ZNC 1.8.2 - https://znc.in)
2022-06-22 00:55:53 +0200cheater(~Username@user/cheater) (Ping timeout: 248 seconds)
2022-06-22 00:56:25 +0200cheater(~Username@user/cheater)
2022-06-22 00:58:00 +0200moonsheep(~user@user/moonsheep) (Remote host closed the connection)
2022-06-22 00:58:12 +0200 <geekosaur> https://discourse.llvm.org/t/lost-ztin4llvm13errorinfobasee-symbol/3077
2022-06-22 00:58:46 +0200 <geekosaur> sounds like you need to rebuild llvm with -DENABLE_LLVM_RTTI=ON
2022-06-22 00:59:00 +0200alp__(~alp@user/alp)
2022-06-22 01:00:00 +0200 <geekosaur> I thought that name looked mangled but I wasn't expecting c++ mangling searched for by haskell
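The mangled symbol above can be read by hand: in the Itanium C++ ABI, `_ZTI` marks a typeinfo object and `N ... E` wraps a nested name made of length-prefixed identifiers. Typeinfo objects are only emitted when RTTI is enabled, which is why the missing symbol points at the RTTI build option. A minimal Haskell sketch that decodes only this one shape of name follows; `demangleTypeinfo` and `components` are hypothetical helper names, not part of any library, and this is not a general demangler.

```haskell
import Data.Char (isDigit)
import Data.List (intercalate, stripPrefix)

-- | Decode symbols of the shape _ZTI N <len><ident>... E, i.e. the
-- typeinfo object of a nested C++ name. Any other shape yields Nothing.
demangleTypeinfo :: String -> Maybe String
demangleTypeinfo sym = do
  rest  <- stripPrefix "_ZTIN" sym
  parts <- components rest
  pure ("typeinfo for " ++ intercalate "::" parts)

-- | Read length-prefixed identifiers until the closing 'E'.
components :: String -> Maybe [String]
components "E" = Just []
components s =
  case span isDigit s of
    ("", _)        -> Nothing
    (digits, rest) ->
      let n             = read digits
          (name, rest') = splitAt n rest
      in if length name == n
           then (name :) <$> components rest'
           else Nothing
```

In GHCi, `demangleTypeinfo "_ZTIN4llvm13ErrorInfoBaseE"` gives `Just "typeinfo for llvm::ErrorInfoBase"`.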
2022-06-22 01:03:59 +0200nate4(~nate@98.45.169.16)
2022-06-22 01:04:41 +0200 <geekosaur> oh, they left
2022-06-22 01:05:12 +0200 <geekosaur> @tell moonsheep per https://discourse.llvm.org/t/lost-ztin4llvm13errorinfobasee-symbol/3077 you need to rebuild llvm with -DENABLE_LLVM_RTTI=ON
2022-06-22 01:05:13 +0200 <lambdabot> Consider it noted.
2022-06-22 01:06:40 +0200chomwitt(~chomwitt@2a02:587:dc0d:e600:d03e:b48f:9497:fc81) (Remote host closed the connection)
2022-06-22 01:08:22 +0200moonsheep(~user@user/moonsheep)
2022-06-22 01:08:41 +0200nate4(~nate@98.45.169.16) (Ping timeout: 248 seconds)
2022-06-22 01:08:52 +0200 <moonsheep> geekosaur: ah thanks I'll try that now
2022-06-22 01:08:58 +0200 <moonsheep> Yeah sorry for leaving I tried rebooting
2022-06-22 01:09:37 +0200 <geekosaur[m]> No worries
2022-06-22 01:09:42 +0200stiell(~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds)
2022-06-22 01:11:46 +0200 <moonsheep> Hmm, cmake tells me that CMake Warning:
2022-06-22 01:11:46 +0200 <moonsheep> Manually-specified variables were not used by the project:
2022-06-22 01:11:46 +0200 <moonsheep> ENABLE_LLVM_RTTI
2022-06-22 01:11:54 +0200 <moonsheep> Oops didn't mean to paste like that
2022-06-22 01:13:01 +0200 <moonsheep> Ah it's actually called `LLVM_ENABLE_RTTI`
2022-06-22 01:21:33 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c)
2022-06-22 01:23:10 +0200Lord_of_Life(~Lord@user/lord-of-life/x-2819915) (Ping timeout: 240 seconds)
2022-06-22 01:23:22 +0200Lord_of_Life_(~Lord@user/lord-of-life/x-2819915)
2022-06-22 01:23:27 +0200quarkyalice(~quarkyali@user/quarkyalice) (Quit: quarkyalice)
2022-06-22 01:24:36 +0200Lord_of_Life_Lord_of_Life
2022-06-22 01:26:46 +0200quarkyalice(~quarkyali@user/quarkyalice)
2022-06-22 01:27:00 +0200AlexNoo_(~AlexNoo@178.34.160.206)
2022-06-22 01:29:01 +0200AlexZenon(~alzenon@94.233.240.20) (Ping timeout: 256 seconds)
2022-06-22 01:29:34 +0200Tuplanolla(~Tuplanoll@91-159-69-97.elisa-laajakaista.fi) (Quit: Leaving.)
2022-06-22 01:30:43 +0200Alex_test(~al_test@94.233.240.20) (Ping timeout: 256 seconds)
2022-06-22 01:30:43 +0200AlexNoo(~AlexNoo@94.233.240.20) (Ping timeout: 256 seconds)
2022-06-22 01:31:21 +0200BusConscious(~martin@ip5f5bdedc.dynamic.kabel-deutschland.de) (Remote host closed the connection)
2022-06-22 01:32:19 +0200stackdroid18(14094@user/stackdroid)
2022-06-22 01:32:52 +0200AlexZenon(~alzenon@178.34.160.206)
2022-06-22 01:32:54 +0200stiell(~stiell@gateway/tor-sasl/stiell)
2022-06-22 01:33:00 +0200money(~Gambino@user/polo)
2022-06-22 01:34:34 +0200Alex_test(~al_test@178.34.160.206)
2022-06-22 01:37:00 +0200mixfix41(~sdenynine@user/mixfix41)
2022-06-22 01:37:31 +0200money(~Gambino@user/polo) ()
2022-06-22 01:38:53 +0200pavonia(~user@user/siracusa)
2022-06-22 01:40:21 +0200tv(~tv@user/tv) (Ping timeout: 256 seconds)
2022-06-22 01:46:16 +0200 <moonsheep> Oh forgot to report back here: it worked beautifully!
2022-06-22 01:46:18 +0200 <moonsheep> Thank you very much
2022-06-22 01:46:32 +0200 <moonsheep> In case anyone is interested, this is my full cmake command:
2022-06-22 01:46:33 +0200 <moonsheep> cmake .. -DCMAKE_BUILD_TYPE=Release -GNinja -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_ENABLE_RTTI=ON -DBUILD_SHARED_LIBS=ON
2022-06-22 01:46:43 +0200 <moonsheep> llvm 9.0.1
2022-06-22 01:47:00 +0200moonsheep(~user@user/moonsheep) (ERC 5.4 (IRC client for GNU Emacs 28.1))
2022-06-22 01:47:56 +0200cosimone(~user@93-44-186-171.ip98.fastwebnet.it) (Read error: Connection reset by peer)
2022-06-22 01:49:45 +0200k8yun(~k8yun@user/k8yun)
2022-06-22 01:52:21 +0200rkk(~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f)
2022-06-22 01:52:33 +0200tv(~tv@user/tv)
2022-06-22 01:55:08 +0200k8yun(~k8yun@user/k8yun) (Quit: Leaving)
2022-06-22 01:55:40 +0200rkk(~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f) (Remote host closed the connection)
2022-06-22 02:00:09 +0200juri__(~juri@79.140.115.124)
2022-06-22 02:01:29 +0200[itchyjunk](~itchyjunk@user/itchyjunk/x-7353470) (Ping timeout: 248 seconds)
2022-06-22 02:02:29 +0200jmdaemon(~jmdaemon@user/jmdaemon)
2022-06-22 02:02:58 +0200quarkyalice(~quarkyali@user/quarkyalice) (Quit: quarkyalice)
2022-06-22 02:03:05 +0200juri_(~juri@79.140.115.72) (Ping timeout: 248 seconds)
2022-06-22 02:05:40 +0200[itchyjunk](~itchyjunk@user/itchyjunk/x-7353470)
2022-06-22 02:06:07 +0200quarkyalice(~quarkyali@user/quarkyalice)
2022-06-22 02:13:58 +0200vysn(~vysn@user/vysn)
2022-06-22 02:21:04 +0200 <Axman6> dsal: isn't that just sequence?
2022-06-22 02:21:28 +0200 <Axman6> > sequence ["ABC","TUV","XYZ"]
2022-06-22 02:21:30 +0200 <lambdabot> ["ATX","ATY","ATZ","AUX","AUY","AUZ","AVX","AVY","AVZ","BTX","BTY","BTZ","BU...
2022-06-22 02:27:27 +0200td_(~td@muedsl-82-207-238-103.citykom.de)
2022-06-22 02:29:28 +0200 <dsal> Axman6: It is in that case, I think, but this was groups of three of two things.
2022-06-22 02:29:42 +0200 <dsal> > sequence [[True, False]]
2022-06-22 02:29:44 +0200 <lambdabot> [[True],[False]]
2022-06-22 02:30:13 +0200 <dsal> > sequence [[True], [False]]
2022-06-22 02:30:15 +0200 <lambdabot> [[True,False]]
2022-06-22 02:30:43 +0200 <dsal> I can't make that exciting.
2022-06-22 02:30:50 +0200pleo(~pleo@user/pleo) (Ping timeout: 240 seconds)
2022-06-22 02:35:27 +0200jmcarthur(~jmcarthur@c-73-29-224-10.hsd1.nj.comcast.net)
2022-06-22 02:36:06 +0200jmcarthur(~jmcarthur@c-73-29-224-10.hsd1.nj.comcast.net) (Client Quit)
2022-06-22 02:38:38 +0200pretty_dumm_guy(trottel@gateway/vpn/protonvpn/prettydummguy/x-88029655) (Ping timeout: 240 seconds)
2022-06-22 02:40:45 +0200esrh(~user@res404s-128-61-105-50.res.gatech.edu)
2022-06-22 02:42:12 +0200 <jackdk> it alternates, that's exciting!
2022-06-22 02:46:11 +0200 <dsal> > sequence [[True, False]]
2022-06-22 02:46:13 +0200 <lambdabot> [[True],[False]]
2022-06-22 02:46:17 +0200xff0x(~xff0x@b133147.ppp.asahi-net.or.jp) (Ping timeout: 248 seconds)
2022-06-22 02:46:28 +0200 <hpc> it'd be more fun if it was like that for all inputs
2022-06-22 02:46:30 +0200 <dsal> My attention span is short enough that I just typed up a thing I already typed up to try it.
2022-06-22 02:46:53 +0200 <dsal> > replicateM 3 ["ABC", "TUV", "XYZ"]
2022-06-22 02:46:55 +0200 <Axman6> It could've changed in that time
2022-06-22 02:46:55 +0200 <lambdabot> [["ABC","ABC","ABC"],["ABC","ABC","TUV"],["ABC","ABC","XYZ"],["ABC","TUV","A...
2022-06-22 02:47:16 +0200 <dsal> > replicateM 3 "ABC" -- I guess at this point, it's just permutations of 3
2022-06-22 02:47:18 +0200 <lambdabot> ["AAA","AAB","AAC","ABA","ABB","ABC","ACA","ACB","ACC","BAA","BAB","BAC","BB...
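The lambdabot results above follow from the list monad: `sequence` picks one element from each inner list (a Cartesian product), and `replicateM n xs` is `sequence (replicate n xs)`, so it yields every length-n word over `xs` with repetition allowed rather than permutations in the strict sense. A small sketch, where `choices` and `wordsOfLength` are illustrative names introduced here:

```haskell
import Control.Monad (replicateM)

-- | All ways to pick one element from each inner list (Cartesian product).
choices :: [[a]] -> [[a]]
choices = sequence

-- | All words of length n over an alphabet, repetition allowed.
-- Equivalent to: sequence (replicate n alphabet)
wordsOfLength :: Int -> [a] -> [[a]]
wordsOfLength = replicateM
```

For example, `choices ["ABC","TUV","XYZ"]` and `wordsOfLength 3 "ABC"` each have 27 elements, while `Data.List.permutations "ABC"` has only 6.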
2022-06-22 02:50:12 +0200quarkyalice(~quarkyali@user/quarkyalice) (Remote host closed the connection)
2022-06-22 02:51:21 +0200rkk(~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f)
2022-06-22 02:51:43 +0200quarkyalice(~quarkyali@user/quarkyalice)
2022-06-22 02:55:03 +0200quarkyalice(~quarkyali@user/quarkyalice) (Client Quit)
2022-06-22 02:57:06 +0200rkk(~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f) (Quit: Leaving)
2022-06-22 02:59:23 +0200cheater1__(~Username@user/cheater)
2022-06-22 02:59:30 +0200cheater(~Username@user/cheater) (Ping timeout: 264 seconds)
2022-06-22 02:59:37 +0200cheater1__cheater
2022-06-22 03:01:58 +0200machinedgod(~machinedg@66.244.246.252) (Ping timeout: 240 seconds)
2022-06-22 03:03:02 +0200cheater(~Username@user/cheater) (Client Quit)
2022-06-22 03:03:47 +0200cheater(~Username@user/cheater)
2022-06-22 03:04:02 +0200Guest27(~Guest27@2601:281:d47f:1590::2df)
2022-06-22 03:04:32 +0200notzmv(~zmv@user/notzmv) (Ping timeout: 255 seconds)
2022-06-22 03:09:12 +0200moet(~moet@mobile-166-177-248-235.mycingular.net)
2022-06-22 03:09:32 +0200moet(~moet@mobile-166-177-248-235.mycingular.net) (Client Quit)
2022-06-22 03:10:32 +0200 <Guest27> Is Cabal supposed to cache ghc options by default? If I run `cabal build --ghc-options -ddump-splices`, future builds will always dump the splices even without any ghc options passed until running `cabal clean`
2022-06-22 03:18:35 +0200nate4(~nate@98.45.169.16)
2022-06-22 03:20:22 +0200Guest27(~Guest27@2601:281:d47f:1590::2df) (Quit: Client closed)
2022-06-22 03:23:30 +0200nate4(~nate@98.45.169.16) (Ping timeout: 268 seconds)
2022-06-22 03:24:17 +0200stefan-_(~cri@42dots.de) (Ping timeout: 246 seconds)
2022-06-22 03:28:03 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 03:28:42 +0200stackdroid18(14094@user/stackdroid) (Quit: Lost terminal)
2022-06-22 03:29:06 +0200stefan-_(~cri@42dots.de)
2022-06-22 03:30:39 +0200xff0x(~xff0x@125x103x176x34.ap125.ftth.ucom.ne.jp)
2022-06-22 03:33:13 +0200alp__(~alp@user/alp) (Ping timeout: 248 seconds)
2022-06-22 03:34:15 +0200esrh(~user@res404s-128-61-105-50.res.gatech.edu) (Remote host closed the connection)
2022-06-22 03:50:01 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 268 seconds)
2022-06-22 03:56:12 +0200stiell(~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds)
2022-06-22 04:04:33 +0200nibelungen(~asturias@2001:19f0:7001:638:5400:3ff:fef3:8725)
2022-06-22 04:07:04 +0200kimjetwav(~user@2607:fea8:2340:da00:1282:4dfa:aaca:27db)
2022-06-22 04:09:51 +0200stiell(~stiell@gateway/tor-sasl/stiell)
2022-06-22 04:20:36 +0200FinnElija(~finn_elij@user/finn-elija/x-0085643) (Killed (NickServ (Forcing logout FinnElija -> finn_elija)))
2022-06-22 04:20:36 +0200finn_elija(~finn_elij@user/finn-elija/x-0085643)
2022-06-22 04:20:36 +0200finn_elijaFinnElija
2022-06-22 04:26:31 +0200Unicorn_Princess(~Unicorn_P@93-103-228-248.dynamic.t-2.net) (Remote host closed the connection)
2022-06-22 04:27:24 +0200frost(~frost@user/frost)
2022-06-22 04:36:38 +0200liz(~liz@cpc84585-newc17-2-0-cust60.16-2.cable.virginm.net) (Ping timeout: 240 seconds)
2022-06-22 04:37:48 +0200jao(~jao@cpc103048-sgyl39-2-0-cust502.18-2.cable.virginm.net) (Ping timeout: 276 seconds)
2022-06-22 04:39:45 +0200mikoto-chan(~mikoto-ch@esm-84-240-99-143.netplaza.fi) (Ping timeout: 276 seconds)
2022-06-22 04:48:00 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Remote host closed the connection)
2022-06-22 04:56:26 +0200esrh(~user@res404s-128-61-105-50.res.gatech.edu)
2022-06-22 05:02:30 +0200td_(~td@muedsl-82-207-238-103.citykom.de) (Ping timeout: 264 seconds)
2022-06-22 05:04:09 +0200td_(~td@muedsl-82-207-238-203.citykom.de)
2022-06-22 05:04:18 +0200vysn(~vysn@user/vysn) (Ping timeout: 264 seconds)
2022-06-22 05:14:43 +0200notzmv(~zmv@user/notzmv)
2022-06-22 05:16:35 +0200waleee(~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340) (Ping timeout: 244 seconds)
2022-06-22 05:18:37 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net)
2022-06-22 05:21:07 +0200nate4(~nate@98.45.169.16)
2022-06-22 05:22:45 +0200z0k(~z0k@206.84.141.12)
2022-06-22 05:22:58 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Ping timeout: 240 seconds)
2022-06-22 05:41:39 +0200stiell(~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds)
2022-06-22 05:42:12 +0200stiell(~stiell@gateway/tor-sasl/stiell)
2022-06-22 05:42:23 +0200[itchyjunk](~itchyjunk@user/itchyjunk/x-7353470) (Remote host closed the connection)
2022-06-22 05:43:19 +0200esrh(~user@res404s-128-61-105-50.res.gatech.edu) (Remote host closed the connection)
2022-06-22 05:49:05 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 06:05:38 +0200nate4(~nate@98.45.169.16) (Ping timeout: 246 seconds)
2022-06-22 06:09:39 +0200Vajb(~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) (Read error: Connection reset by peer)
2022-06-22 06:10:09 +0200Vajb(~Vajb@2001:999:40:4c50:1b24:879c:6df3:1d06)
2022-06-22 06:14:02 +0200Sgeo_(~Sgeo@user/sgeo)
2022-06-22 06:14:04 +0200Kaipei(~Kaiepi@156.34.47.253)
2022-06-22 06:14:34 +0200Feuermagier_(~Feuermagi@138.199.36.237)
2022-06-22 06:14:42 +0200apache2(apache2@anubis.0x90.dk)
2022-06-22 06:15:14 +0200Katarushisu4(~Katarushi@cpc147334-finc20-2-0-cust27.4-2.cable.virginm.net)
2022-06-22 06:15:23 +0200EsoAlgo1(~EsoAlgo@129.146.136.145)
2022-06-22 06:15:32 +0200elkcl_(~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru)
2022-06-22 06:15:36 +0200ulvarref`(~user@188.124.56.153)
2022-06-22 06:15:47 +0200Natch(~natch@c-9e07225c.038-60-73746f7.bbcust.telenor.se) (Ping timeout: 246 seconds)
2022-06-22 06:15:47 +0200lambdabot(~lambdabot@haskell/bot/lambdabot) (Ping timeout: 246 seconds)
2022-06-22 06:16:29 +0200AlexZenon(~alzenon@178.34.160.206) (Ping timeout: 246 seconds)
2022-06-22 06:16:29 +0200elkcl(~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru) (Ping timeout: 246 seconds)
2022-06-22 06:16:29 +0200Henkru(~henkru@kapsi.fi) (Ping timeout: 246 seconds)
2022-06-22 06:16:29 +0200elkcl_elkcl
2022-06-22 06:16:50 +0200tv(~tv@user/tv) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200gentauro(~gentauro@user/gentauro) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200drewr(~drew@user/drewr) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200Katarushisu(~Katarushi@cpc147334-finc20-2-0-cust27.4-2.cable.virginm.net) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200turlando(~turlando@user/turlando) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200EsoAlgo(~EsoAlgo@129.146.136.145) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200echoreply(~echoreply@45.32.163.16) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200Guest1698(~Guest1698@20.83.116.49) (Ping timeout: 246 seconds)
2022-06-22 06:16:50 +0200Katarushisu4Katarushisu
2022-06-22 06:16:50 +0200EsoAlgo1EsoAlgo
2022-06-22 06:17:11 +0200Kaiepi(~Kaiepi@156.34.47.253) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200Sgeo(~Sgeo@user/sgeo) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200ulvarrefr(~user@188.124.56.153) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200apache(apache2@anubis.0x90.dk) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200Dykam(Dykam@dykam.nl) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200Feuermagier(~Feuermagi@user/feuermagier) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200ezzieyguywuf(~Unknown@user/ezzieyguywuf) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200hughjfchen(~hughjfche@vmi556545.contaboserver.net) (Ping timeout: 246 seconds)
2022-06-22 06:17:11 +0200JimL(~quassel@89-162-2-132.fiber.signal.no) (Ping timeout: 246 seconds)
2022-06-22 06:17:25 +0200turlando(~turlando@93.51.40.51)
2022-06-22 06:17:25 +0200turlando(~turlando@93.51.40.51) (Changing host)
2022-06-22 06:17:25 +0200turlando(~turlando@user/turlando)
2022-06-22 06:17:43 +0200lambdabot(~lambdabot@silicon.int-e.eu)
2022-06-22 06:17:43 +0200lambdabot(~lambdabot@silicon.int-e.eu) (Changing host)
2022-06-22 06:17:43 +0200lambdabot(~lambdabot@haskell/bot/lambdabot)
2022-06-22 06:17:48 +0200Dykam(Dykam@dykam.nl)
2022-06-22 06:17:49 +0200JimL(~quassel@89-162-2-132.fiber.signal.no)
2022-06-22 06:18:22 +0200Henkru(henkru@kapsi.fi)
2022-06-22 06:18:54 +0200gentauro(~gentauro@user/gentauro)
2022-06-22 06:19:06 +0200hughjfchen(~hughjfche@vmi556545.contaboserver.net)
2022-06-22 06:19:07 +0200ezzieyguywuf(~Unknown@user/ezzieyguywuf)
2022-06-22 06:20:37 +0200odnes(~odnes@5-203-249-68.pat.nym.cosmote.net)
2022-06-22 06:20:45 +0200Natch(~natch@c-9e07225c.038-60-73746f7.bbcust.telenor.se)
2022-06-22 06:20:53 +0200AlexZenon(~alzenon@178.34.160.206)
2022-06-22 06:21:55 +0200chexum(~quassel@gateway/tor-sasl/chexum) (Quit: No Ping reply in 180 seconds.)
2022-06-22 06:23:32 +0200chexum(~quassel@gateway/tor-sasl/chexum)
2022-06-22 06:27:05 +0200_73(~user@pool-108-49-252-36.bstnma.fios.verizon.net)
2022-06-22 06:30:19 +0200echoreply(~echoreply@2001:19f0:9002:1f3b:5400:ff:fe6f:8b8d)
2022-06-22 06:30:30 +0200Guest1698(~Guest1698@20.83.116.49)
2022-06-22 06:31:00 +0200drewr(~drew@user/drewr)
2022-06-22 06:31:10 +0200tv(~tv@user/tv)
2022-06-22 06:31:18 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 264 seconds)
2022-06-22 06:45:54 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net)
2022-06-22 06:48:44 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 06:50:42 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Ping timeout: 268 seconds)
2022-06-22 06:58:10 +0200odnes(~odnes@5-203-249-68.pat.nym.cosmote.net) (Remote host closed the connection)
2022-06-22 07:18:49 +0200mvk(~mvk@2607:fea8:5ce3:8500::4588) (Ping timeout: 248 seconds)
2022-06-22 07:18:58 +0200Kaipei(~Kaiepi@156.34.47.253) (Ping timeout: 240 seconds)
2022-06-22 07:31:05 +0200Teacup(~teacup@user/teacup) (Quit: No Ping reply in 180 seconds.)
2022-06-22 07:32:42 +0200Teacup(~teacup@user/teacup)
2022-06-22 07:41:13 +0200michalz(~michalz@185.246.204.107)
2022-06-22 07:42:40 +0200mjs22(~mjs22@76.115.19.239)
2022-06-22 07:48:14 +0200causal(~user@50.35.83.177)
2022-06-22 07:48:34 +0200takuan(~takuan@178-116-218-225.access.telenet.be)
2022-06-22 07:48:42 +0200mbuf(~Shakthi@122.164.15.152)
2022-06-22 07:50:18 +0200_ht(~quassel@231-169-21-31.ftth.glasoperator.nl)
2022-06-22 07:57:08 +0200jpds1(~jpds@gateway/tor-sasl/jpds)
2022-06-22 08:19:34 +0200acidjnk_new(~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de)
2022-06-22 08:23:44 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) (Remote host closed the connection)
2022-06-22 08:24:26 +0200mixfix41(~sdenynine@user/mixfix41) (Ping timeout: 268 seconds)
2022-06-22 08:24:27 +0200_ht(~quassel@231-169-21-31.ftth.glasoperator.nl) (Remote host closed the connection)
2022-06-22 08:25:25 +0200HotblackDesiato(~HotblackD@gateway/tor-sasl/hotblackdesiato) (Remote host closed the connection)
2022-06-22 08:25:28 +0200jgeerds(~jgeerds@55d45f48.access.ecotel.net)
2022-06-22 08:25:46 +0200HotblackDesiato(~HotblackD@gateway/tor-sasl/hotblackdesiato)
2022-06-22 08:26:32 +0200vysn(~vysn@user/vysn)
2022-06-22 08:31:57 +0200Sgeo_(~Sgeo@user/sgeo) (Read error: Connection reset by peer)
2022-06-22 08:37:16 +0200dsrt^(~dsrt@50.237.44.186)
2022-06-22 08:42:36 +0200chexum(~quassel@gateway/tor-sasl/chexum) (Remote host closed the connection)
2022-06-22 08:45:01 +0200chexum(~quassel@gateway/tor-sasl/chexum)
2022-06-22 08:50:39 +0200leeb(~leeb@KD106155002239.au-net.ne.jp)
2022-06-22 08:57:28 +0200kimjetwav(~user@2607:fea8:2340:da00:1282:4dfa:aaca:27db) (Remote host closed the connection)
2022-06-22 08:57:53 +0200kimjetwav(~user@2607:fea8:2340:da00:b4b3:9de1:4864:1487)
2022-06-22 08:59:02 +0200BusConscious(~martin@ip5f5bdedc.dynamic.kabel-deutschland.de)
2022-06-22 09:00:15 +0200kimjetwav(~user@2607:fea8:2340:da00:b4b3:9de1:4864:1487) (Remote host closed the connection)
2022-06-22 09:00:40 +0200kimjetwav(~user@2607:fea8:2340:da00:487c:b90f:99a5:bda3)
2022-06-22 09:01:36 +0200 <BusConscious> hello everyone
2022-06-22 09:01:40 +0200 <BusConscious> Could not find module ‘Data.ByteString.UTF8’
2022-06-22 09:02:16 +0200 <BusConscious> What's going on there? I do have bytestring installed both globally and locally as dependency in my cabal
2022-06-22 09:02:16 +0200jonathanx(~jonathan@dyn-5-sc.cdg.chalmers.se)
2022-06-22 09:02:30 +0200 <Axman6> do you have a version of bytestring which has that module installed?
2022-06-22 09:02:30 +0200 <BusConscious> and I can import Data.ByteString
2022-06-22 09:02:45 +0200Infinite(~Infinite@2405:204:5381:d6e2:eefe:bfdb:b3b1:f5f4)
2022-06-22 09:02:50 +0200 <Axman6> https://hackage.haskell.org/package/bytestring doesn't export that module
2022-06-22 09:03:26 +0200 <tomsmeding> BusConscious: utf8-string exports that
2022-06-22 09:03:36 +0200 <tomsmeding> what docs told you it was in 'bytestring'?
2022-06-22 09:04:27 +0200 <BusConscious> I want to convert Strings to ByteStrings back and forth
2022-06-22 09:04:38 +0200 <Axman6> I'm pretty sure it's never been part of bytestring
2022-06-22 09:04:45 +0200 <BusConscious> (I don't want to do that, but I have to)
2022-06-22 09:04:48 +0200 <Axman6> Have you looked at the text package?
2022-06-22 09:04:52 +0200 <tomsmeding> I've been using utf8-string for that, seems to work well enough
2022-06-22 09:04:58 +0200 <tomsmeding> that exports Data.ByteString.UTF8
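For reference, `Data.ByteString.UTF8` in utf8-string provides `fromString` and `toString` for this conversion. The sketch below does the same round trip with the `text` package instead (a swapped-in alternative, chosen here only because it ships with GHC); `toUtf8` and `fromUtf8` are hypothetical helper names. Decoding returns an `Either` because arbitrary bytes are not necessarily valid UTF-8.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (UnicodeException)

-- | Encode a String as UTF-8 bytes.
toUtf8 :: String -> BS.ByteString
toUtf8 = TE.encodeUtf8 . T.pack

-- | Decode UTF-8 bytes back to a String; Left on invalid input.
fromUtf8 :: BS.ByteString -> Either UnicodeException String
fromUtf8 = fmap T.unpack . TE.decodeUtf8'
```

A lone byte such as `BS.pack [0xff]` is rejected by `fromUtf8`, which is exactly the case a shell has to expect from non-UTF-8 file names.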
2022-06-22 09:06:01 +0200lagash(lagash@lagash.shelltalk.net) (Ping timeout: 248 seconds)
2022-06-22 09:08:10 +0200dschrempf(~dominik@070-207.dynamic.dsl.fonira.net)
2022-06-22 09:08:13 +0200 <BusConscious> ok that seems to work
2022-06-22 09:08:30 +0200kimjetwav(~user@2607:fea8:2340:da00:487c:b90f:99a5:bda3) (Ping timeout: 264 seconds)
2022-06-22 09:08:35 +0200 <Axman6> BusConscious: can you tell us more about what you actually want to do? because working with text is usually something we'd do using the text package
2022-06-22 09:08:36 +0200frost(~frost@user/frost) (Quit: Client closed)
2022-06-22 09:12:58 +0200bitdex(~bitdex@gateway/tor-sasl/bitdex) (Remote host closed the connection)
2022-06-22 09:13:57 +0200bitdex(~bitdex@gateway/tor-sasl/bitdex)
2022-06-22 09:14:20 +0200 <BusConscious> Axman6: So I'm trying to write a unix shell and people here have been telling me to not use String to represent filepaths and my string, because there is no requirement in POSIX, that these things should be utf-8 or whatever, which is an argument I can see for sure
2022-06-22 09:14:30 +0200gmg(~user@user/gehmehgeh)
2022-06-22 09:15:18 +0200 <tomsmeding> but you _are_ converting some things to String at some point, apparently
2022-06-22 09:16:05 +0200 <tomsmeding> I guess what Axman6 is getting at is that you can probably replace all your uses of String with Text
2022-06-22 09:16:05 +0200 <BusConscious> yes, because it's so much easier to work with and a lot of functions in the library only accept String
2022-06-22 09:16:15 +0200 <BusConscious> https://hackage.haskell.org/package/unix-2.7.2.2/docs/System-Posix-IO.html
2022-06-22 09:16:32 +0200 <tomsmeding> BusConscious: that sounds like a weird argument. We want to represent non-utf8 things, but then we work with it as strings because that's easier
2022-06-22 09:16:56 +0200 <BusConscious> Even in the POSIX API FilePath is accepted which is a type synonym of String
2022-06-22 09:17:24 +0200 <tomsmeding> right
2022-06-22 09:17:50 +0200elkcl_(~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru)
2022-06-22 09:17:58 +0200elkcl(~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru) (Ping timeout: 240 seconds)
2022-06-22 09:17:58 +0200elkcl_elkcl
2022-06-22 09:18:06 +0200 <tomsmeding> bytestring does have IO functions, but not to FDs https://hackage.haskell.org/package/bytestring-0.11.3.1/docs/Data-ByteString.html#v:hPut
2022-06-22 09:18:41 +0200 <tomsmeding> like, if you're doing the IO in String form, why even use ByteString internally
2022-06-22 09:20:20 +0200 <BusConscious> ok so I should stick to either String or Text? On the other hand I might have to use FFI again and converting to and from a CString may be easier with a ByteString..
2022-06-22 09:20:31 +0200 <BusConscious> or I use text
2022-06-22 09:20:43 +0200 <tomsmeding> if you want to avoid assuming UTF8, you should stick to ByteString :p
2022-06-22 09:20:49 +0200benin0(~benin@183.82.26.120)
2022-06-22 09:20:56 +0200 <tomsmeding> https://hackage.haskell.org/package/base-4.14.0.0/docs/GHC-IO-FD.html has FD I/O functions, though with Ptr
2022-06-22 09:21:11 +0200 <tomsmeding> if you are okay with assuming UTF8, use Text
2022-06-22 09:21:23 +0200 * tomsmeding is guilty of over-using String as well, but am lazy
2022-06-22 09:21:45 +0200coot(~coot@213.134.190.95)
2022-06-22 09:22:17 +0200 <tomsmeding> (that String argument to the read and write functions is simply a function name used in error messages)
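One way to do FD I/O on raw bytes without going through String is the pointer-based pair `fdWriteBuf` / `fdReadBuf` from `System.Posix.IO` in the unix package (a different module from the `GHC.IO.FD` one linked above), combined with ByteString's pointer helpers. This is a sketch under those assumptions; `fdWriteBytes`, `fdReadBytes`, and `pipeRoundTrip` are hypothetical names, and short writes are not retried.

```haskell
import qualified Data.ByteString as BS
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (castPtr)
import System.Posix.IO (closeFd, createPipe, fdReadBuf, fdWriteBuf)
import System.Posix.Types (Fd)

-- | Write raw bytes to a file descriptor. Returns the number of bytes
-- actually written, which may be fewer than requested, as with write(2).
fdWriteBytes :: Fd -> BS.ByteString -> IO Int
fdWriteBytes fd bs =
  unsafeUseAsCStringLen bs $ \(ptr, len) ->
    fromIntegral <$> fdWriteBuf fd (castPtr ptr) (fromIntegral len)

-- | Read up to n raw bytes from a file descriptor (empty at end of file).
fdReadBytes :: Fd -> Int -> IO BS.ByteString
fdReadBytes fd n =
  allocaBytes n $ \buf -> do
    got <- fdReadBuf fd buf (fromIntegral n)
    BS.packCStringLen (castPtr buf, fromIntegral got)

-- | Demo: push a small payload through a pipe and read it back.
pipeRoundTrip :: BS.ByteString -> IO BS.ByteString
pipeRoundTrip bs = do
  (readEnd, writeEnd) <- createPipe
  _ <- fdWriteBytes writeEnd bs
  closeFd writeEnd
  out <- fdReadBytes readEnd (BS.length bs)
  closeFd readEnd
  pure out
```

Bytes such as 0xff and 0x00 pass through unchanged, since no text encoding is involved at any step.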
2022-06-22 09:22:34 +0200acidjnk(~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de)
2022-06-22 09:23:19 +0200Everything(~Everythin@37.115.210.35)
2022-06-22 09:24:17 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c)
2022-06-22 09:24:38 +0200mc47(~mc47@xmonad/TheMC47)
2022-06-22 09:25:11 +0200lortabac(~lortabac@2a01:e0a:541:b8f0:2cd:7ecf:235f:1481)
2022-06-22 09:25:18 +0200acidjnk_new(~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) (Ping timeout: 240 seconds)
2022-06-22 09:26:08 +0200ubert(~Thunderbi@p200300ecdf0da56677798f1bce3bed29.dip0.t-ipconnect.de)
2022-06-22 09:27:16 +0200 <BusConscious> I think I will stick with String for now. It may not be ideal because it assumes UTF8, but having two competing string types is such a pain in the ass. I won't get any joy out of fiddling all these types together.
2022-06-22 09:27:41 +0200 <tomsmeding> if the goal is having fun, then do whatever you want :p
2022-06-22 09:27:51 +0200mixfix41(~sdenynine@user/mixfix41)
2022-06-22 09:28:14 +0200 <tomsmeding> if you were writing production software, I would advise heeding the advice here more
2022-06-22 09:28:17 +0200ccntrq(~Thunderbi@dynamic-077-003-064-244.77.3.pool.telefonica.de)
2022-06-22 09:28:21 +0200 <tomsmeding> but don't let the real world spoil the fun
2022-06-22 09:28:57 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) (Ping timeout: 248 seconds)
2022-06-22 09:30:05 +0200 <BusConscious> One last Q: What happens if I write something like "\xff" in haskell? How is that represented byte-wise?
2022-06-22 09:30:22 +0200 <Axman6> BusConscious: you've actually come across one of the more complicated situations where it's not immediately clear what you should use - for IO of data, it sounds like ByteString is the way to go, just (for now) assume you don't need to care about encoding; if you're piping stdout from one process into another's stdin, just send whatever bytes you get. as for file paths, that is more complex, because Haskell's use of String is not a good choice
2022-06-22 09:31:26 +0200 <tomsmeding> BusConscious: https://tomsmeding.com/ss/get/tomsmeding/SX1k9z
2022-06-22 09:31:41 +0200ccntrq1(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 09:31:49 +0200rendar(~rendar@user/rendar) (Ping timeout: 244 seconds)
2022-06-22 09:32:03 +0200 <Axman6> BusConscious: describing them as competing string types isn't really fair, they all have their uses and their own tradeoffs
2022-06-22 09:32:57 +0200ccntrq(~Thunderbi@dynamic-077-003-064-244.77.3.pool.telefonica.de) (Ping timeout: 256 seconds)
2022-06-22 09:32:57 +0200ccntrq1ccntrq
2022-06-22 09:33:29 +0200 <Axman6> "\xff" depends on what type that string looking thing is. if it's a haskell String, then you'll just have ['\xff']. if it's a text (2.0) it'll be the two byte encoding of the codepoint for 255
2022-06-22 09:34:56 +0200 <Axman6> if it's a text < 2.0 Text, then it'll be the UTF-16 string with the codepoint 255 in it
2022-06-22 09:35:29 +0200 <BusConscious> So what is '\xff' then is it a two byte encoding of the codepoint 255 as well?
2022-06-22 09:35:45 +0200 <tomsmeding> '\xff' of type Char?
2022-06-22 09:35:49 +0200 <BusConscious> yes
2022-06-22 09:35:51 +0200 <Axman6> '\xff' is a Char, which represents a unicode codepoint
2022-06-22 09:36:03 +0200 <tomsmeding> Char is just an Int internally
2022-06-22 09:36:11 +0200 <Axman6> @src Char
2022-06-22 09:36:11 +0200 <lambdabot> data Char = C# Char#
2022-06-22 09:36:16 +0200 <Axman6> @src Char#
2022-06-22 09:36:16 +0200 <lambdabot> Source not found. There are some things that I just don't know.
2022-06-22 09:36:19 +0200 <Axman6> :(
2022-06-22 09:36:48 +0200 <tomsmeding> https://hackage.haskell.org/package/ghc-prim-0.8.0/docs/src/GHC.Prim.html#Char%23
2022-06-22 09:36:54 +0200 <Axman6> but yeah, Char# is basically (or actually?) an Int# (or Int32#?)
2022-06-22 09:37:00 +0200Infinite(~Infinite@2405:204:5381:d6e2:eefe:bfdb:b3b1:f5f4) (Ping timeout: 252 seconds)
2022-06-22 09:37:02 +0200 <BusConscious> Unicode sequences can be at most 3 or 4 bytes right, so they fit in an Int32
2022-06-22 09:37:02 +0200 <tomsmeding> seems to be a primitive type
2022-06-22 09:37:05 +0200 <BusConscious> makes sense
2022-06-22 09:37:36 +0200 <Axman6> Chars are not utf-8 sequences, they are codepoints, they're just a number
2022-06-22 09:37:48 +0200Infinite(~Infinite@49.39.123.213)
2022-06-22 09:38:14 +0200 <Axman6> utf-8 is an encoding, which is what text now uses internally (it used to use utf-16, which was the worst of both worlds of utf-8 and utf-32)
2022-06-22 09:38:39 +0200eod|fserucas(~eod|fseru@193.65.114.89.rev.vodafone.pt)
2022-06-22 09:38:43 +0200eod|fserucas_(~eod|fseru@193.65.114.89.rev.vodafone.pt)
2022-06-22 09:39:16 +0200 <Axman6> A Char may be written, when encoded as utf-8, using 1, 2, 3 or 4 bytes, but a Char is always a 32 bit integer (probably, can't confirm from the link above but I believe that's true)
2022-06-22 09:39:19 +0200 <tomsmeding> finally, found the definition https://hackage.haskell.org/package/base-4.16.0.0/docs/src/GHC.Base.html#line-189
2022-06-22 09:39:37 +0200 <tomsmeding> sizeOf on Char returns 4, so presumably
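[Editor's note] The distinction drawn above (a Char is a code point, UTF-8 is one way to write it as bytes) can be sketched with base alone. This is a minimal illustration, not how text or GHC encode internally; the names utf8Bytes and charStorageSize are made up for this note, and mapping surrogates to U+FFFD is an assumption chosen here.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)
import Foreign.Storable (sizeOf)

-- | Hand-rolled UTF-8 encoding of a single code point (illustration only).
-- Surrogates (U+D800..U+DFFF) are not Unicode scalar values and have no
-- UTF-8 encoding, so this sketch substitutes U+FFFD for them.
utf8Bytes :: Char -> [Word8]
utf8Bytes c
  | n < 0x80                   = [fromIntegral n]
  | n < 0x800                  = [0xC0 .|. hi 6, cont 0]
  | n >= 0xD800 && n <= 0xDFFF = utf8Bytes '\xFFFD'
  | n < 0x10000                = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise                  = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- Leading-byte payload: the bits above position s.
    hi s   = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six bits starting at position s.
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)

-- | Storage size of a Char via its Storable instance (the sizeOf mentioned above).
charStorageSize :: Int
charStorageSize = sizeOf (undefined :: Char)
```

So ord '\xff' is 255 regardless of encoding, while utf8Bytes '\xff' yields the two bytes 0xC3 0xBF, which is the "two byte encoding" mentioned for text 2.0.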
2022-06-22 09:45:54 +0200jgeerds(~jgeerds@55d45f48.access.ecotel.net) (Ping timeout: 276 seconds)
2022-06-22 09:46:18 +0200jonathanx(~jonathan@dyn-5-sc.cdg.chalmers.se) (Ping timeout: 240 seconds)
2022-06-22 09:47:09 +0200Infinite(~Infinite@49.39.123.213) (Quit: Client closed)
2022-06-22 09:47:27 +0200kuribas(~user@ip-188-118-57-242.reverse.destiny.be)
2022-06-22 09:48:59 +0200machinedgod(~machinedg@66.244.246.252)
2022-06-22 09:51:42 +0200dsrt^(~dsrt@50.237.44.186) (Ping timeout: 264 seconds)
2022-06-22 09:51:56 +0200MajorBiscuit(~MajorBisc@wlan-145-94-167-213.wlan.tudelft.nl)
2022-06-22 09:53:21 +0200mjs22(~mjs22@76.115.19.239) (Quit: Leaving)
2022-06-22 09:54:42 +0200progress__(~fffuuuu_i@45.112.243.220)
2022-06-22 09:54:56 +0200raym(~raym@user/raym) (Remote host closed the connection)
2022-06-22 09:58:10 +0200AlexNoo_AlexNoo
2022-06-22 10:03:40 +0200nate4(~nate@98.45.169.16)
2022-06-22 10:06:28 +0200gurkenglas(~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de)
2022-06-22 10:08:10 +0200nate4(~nate@98.45.169.16) (Ping timeout: 240 seconds)
2022-06-22 10:10:33 +0200Guest92(~Guest92@2600:1000:b166:c4f9:ac93:f08:cf56:c856)
2022-06-22 10:10:40 +0200Guest92(~Guest92@2600:1000:b166:c4f9:ac93:f08:cf56:c856) ()
2022-06-22 10:11:04 +0200tzh(~tzh@c-24-21-73-154.hsd1.wa.comcast.net) (Quit: zzz)
2022-06-22 10:11:55 +0200 <merijn> Axman6: tbh, probably more :p
2022-06-22 10:11:58 +0200 <merijn> Axman6: Char is boxed
2022-06-22 10:12:59 +0200 <merijn> I like how 20(?) years after it was written I still have to link people to Joel's unicode blog
2022-06-22 10:13:59 +0200 <Maxdamantus> I wonder when there'll be a standard Unicode string type in Haskell.
2022-06-22 10:14:19 +0200 <Maxdamantus> (rather than one that's only limited to well-formed Unicode strings)
2022-06-22 10:16:11 +0200Kaipei(~Kaiepi@156.34.47.253)
2022-06-22 10:16:22 +0200jonathanx(~jonathan@h-178-174-176-109.A357.priv.bahnhof.se)
2022-06-22 10:18:47 +0200 <merijn> Maxdamantus: Please define well-formed unicode string :D
2022-06-22 10:18:52 +0200 <merijn> See you in about a year
2022-06-22 10:19:28 +0200 <merijn> oh, wait, I read that inverted
2022-06-22 10:19:56 +0200 <Maxdamantus> merijn: these things are clearly defined in the Unicode standard. The `Text` library is limited to well-formed Unicode strings, but according to the Unicode standard, Unicode strings are not necessarily well-formed and are specifically allowed to be any sequence of code units (of a particular type).
2022-06-22 10:20:58 +0200 <Maxdamantus> and the string libraries that are backed by the Unicode consortium (eg, ICU and Java `String`s) work according to the standard.
2022-06-22 10:21:25 +0200 <merijn> Maxdamantus: I don't think Text is limited to well-formed unicode, is it?
2022-06-22 10:21:51 +0200 <Maxdamantus> merijn: it certainly was last time I looked at it (when it was opaquely based on UTF-16).
2022-06-22 10:22:21 +0200 <Maxdamantus> aiui they switched from opaque UTF-16 to opaque UTF-8, but that's mostly an implementation detail, since they don't support storing arbitrary code units.
2022-06-22 10:22:26 +0200 <merijn> Maxdamantus: I mean, you can always use ByteString and text-icu for more niche use cases
2022-06-22 10:22:31 +0200cfricke(~cfricke@user/cfricke)
2022-06-22 10:22:38 +0200chomwitt(~chomwitt@2a02:587:dc0d:e600:1174:892d:39e3:5e01)
2022-06-22 10:22:58 +0200 <Maxdamantus> Right, but I meant I was wondering when there'd be a reasonably ubiquitous Unicode string type.
2022-06-22 10:23:18 +0200 <Maxdamantus> It's kind of crap that Unicode is only handled properly if you use `ByteString`s or ICU
2022-06-22 10:23:18 +0200 <merijn> What purpose would that serve?
2022-06-22 10:23:47 +0200 <merijn> I mean, you seem to have a very specific and niche definition of "handled properly" that is not helpful to 99% of code
2022-06-22 10:23:47 +0200 <Maxdamantus> People could write proper APIs involving things like file names.
2022-06-22 10:24:03 +0200 <merijn> Maxdamantus: No, that wouldn't be fixed by that
2022-06-22 10:24:16 +0200 <merijn> Since the fundamental problem is file name APIs being different across platforms
2022-06-22 10:24:41 +0200Neuromancer(~Neuromanc@user/neuromancer)
2022-06-22 10:24:41 +0200 <Maxdamantus> There's a fairly sane way of representing them as UTF-8 strings on all of the common platforms.
2022-06-22 10:25:00 +0200dschrempf(~dominik@070-207.dynamic.dsl.fonira.net) (Quit: WeeChat 3.5)
2022-06-22 10:25:16 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 10:25:19 +0200 <Maxdamantus> particularly, pass through the bytes as-is on POSIX systems, and convert to WTF-8 on win32.
2022-06-22 10:25:52 +0200 <tomsmeding> how is a sequence of random bytes (not including 0 and '/', okay) even a non-well-formed unicode string
2022-06-22 10:26:01 +0200 <tomsmeding> in what encoding
2022-06-22 10:26:04 +0200 <merijn> Maxdamantus: Filenames are explicitly UTF-16 on Windows
2022-06-22 10:26:14 +0200 <merijn> Maxdamantus: On linux they're "nothing remotely resembling unicode"
2022-06-22 10:26:43 +0200 <merijn> On macOS they're "UTF-16, except that's no longer enforced by the low level filesystem APIs, only the high level ones, RIP you"
2022-06-22 10:26:44 +0200 <Maxdamantus> tomsmeding: a Unicode string is a sequence of code units. In particular, a UTF-8 Unicode string is a sequence of bytes.
2022-06-22 10:27:04 +0200 <Maxdamantus> I can quote the Unicode standard.
2022-06-22 10:27:11 +0200 <tomsmeding> Maxdamantus: right, and not every linux filename is valid utf8. They _are_ all valid latin1, but then every byte sequence is valid latin1
2022-06-22 10:27:23 +0200 <tomsmeding> but using latin1 encoding for windows filenames makes no sense
2022-06-22 10:27:37 +0200 <Maxdamantus> tomsmeding: assuming by "valid" you mean "well-formed", that's not what I'm talking about.
2022-06-22 10:27:49 +0200 <Maxdamantus> https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf#G7404
2022-06-22 10:27:52 +0200 <merijn> Maxdamantus: No, valid means "it's actually unicode"
2022-06-22 10:28:10 +0200 <Maxdamantus> merijn: no.
2022-06-22 10:28:12 +0200 <Maxdamantus> > D80 Unicode string: A code unit sequence containing code units of a particular Unicode
2022-06-22 10:28:13 +0200 <lambdabot> <hint>:1:64: error: parse error on input ‘of’
2022-06-22 10:28:15 +0200 <Maxdamantus> encoding form.
2022-06-22 10:28:28 +0200 <Maxdamantus> D78 Code unit sequence: An ordered sequence of one or more code units
2022-06-22 10:28:33 +0200 <Maxdamantus> When the code unit is an 8-bit unit, a code unit sequence may also be referred
2022-06-22 10:28:33 +0200 <Maxdamantus> to as a byte sequence.
2022-06-22 10:28:48 +0200 <Maxdamantus> "Unicode string" does *NOT* mean well-formed (or "valid") Unicode.
2022-06-22 10:28:59 +0200 <Maxdamantus> The Unicode standard makes that fairly explicit in various places.
2022-06-22 10:29:10 +0200 <tomsmeding> Maxdamantus: so for my understanding, removing well-formedness from the requirements not only makes incompatible sequences of code points allowed (e.g. modifiers that don't work on particular characters, or unpaired surrogates), but also something that doesn't even decode as individual utf8 code points?
2022-06-22 10:29:13 +0200 <Maxdamantus> if you look up "Unicode string" in the glossary, they will even explicitly say that there.
2022-06-22 10:29:28 +0200 <Maxdamantus> > Unicode String. A code unit sequence containing code units of a particular Unicode encoding form (whether well-formed or not). (See definition D80 in Section 3.9, Unicode Encoding Forms.)
2022-06-22 10:29:29 +0200 <lambdabot> <hint>:1:60: error: parse error on input ‘of’
2022-06-22 10:29:35 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Ping timeout: 255 seconds)
2022-06-22 10:29:38 +0200 <merijn> Maxdamantus: That disagrees with what you're saying
2022-06-22 10:29:45 +0200 <merijn> Maxdamantus: "a code unit sequence"
2022-06-22 10:29:59 +0200 <Maxdamantus> merijn: right, that's what I've been saying.
2022-06-22 10:29:59 +0200 <merijn> Maxdamantus: "Code unit: The minimal bit combination that can represent a unit of encoded text"
2022-06-22 10:30:02 +0200 <merijn> for processing or interchange.
2022-06-22 10:30:23 +0200 <merijn> Maxdamantus: I interpret that to mean it only includes *valid* encodings of Unicode codepoints
2022-06-22 10:30:38 +0200 <merijn> not all byte sequences are made up of only valid unicode codepoint encodings
2022-06-22 10:30:50 +0200 <Maxdamantus> merijn: so what would be an example of a Unicode string that is not well-formed?
2022-06-22 10:31:12 +0200 <merijn> Maxdamantus: A well-formed one says certain unicode codepoints can only occur before/after certain types of characters
2022-06-22 10:31:17 +0200 <tomsmeding> Maxdamantus: is [\x80] a non-well-formed unicode string in utf8?
2022-06-22 10:31:23 +0200 <merijn> think of "accent modifiers" like ` ' ^
2022-06-22 10:31:27 +0200 <Maxdamantus> merijn: no. That's not what well-formedness is.
2022-06-22 10:31:39 +0200 <Maxdamantus> merijn: well-formedness does not involve interpretation of code points.
2022-06-22 10:32:02 +0200 <tomsmeding> ah, see D84
2022-06-22 10:32:10 +0200 <Maxdamantus> merijn: well-formed simply means that it represents a sequence of Unicode scalar values.
2022-06-22 10:32:41 +0200 <Maxdamantus> merijn: USVs can even be undefined, and you still have a well-formed Unicode string.
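[Editor's note] The notion of well-formedness used here (the string represents a sequence of Unicode scalar values, with no interpretation of what those values mean) can be sketched on Haskell's String type using base only. The function names are hypothetical; the surrogate range U+D800..U+DFFF is the one excluded from scalar values.

```haskell
import Data.Char (ord)

-- | A Unicode scalar value is any code point outside the surrogate range.
-- Unassigned code points and noncharacters still count as scalar values.
isScalarValue :: Char -> Bool
isScalarValue c = n < 0xD800 || n > 0xDFFF
  where n = ord c

-- | A [Char] can be encoded in UTF-8/16/32 exactly when every element
-- is a scalar value.
encodableString :: String -> Bool
encodableString = all isScalarValue
```

Note that Haskell's Char admits surrogates, so a String is not guaranteed to satisfy encodableString.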
2022-06-22 10:33:04 +0200 <tomsmeding> representing a non-well-formed unicode string then basically means either 1. storing (original bytes, purported encoding), or 2. some ugly sum type with various decode failures as options
2022-06-22 10:33:11 +0200 <merijn> "Ill-formed: A Unicode code unit sequence that purports to be in a Unicode encoding"
2022-06-22 10:33:14 +0200 <merijn> form is called ill-formed if and only if it does not follow the specification of that Unicode encoding form.
2022-06-22 10:33:26 +0200 <tomsmeding> merijn: see the second bullet point under D84 about UTF8
2022-06-22 10:33:30 +0200 <Maxdamantus> Feel free to read the Unicode standard. I'm quite familiar with chapter 3, which defines all of these things.
2022-06-22 10:33:56 +0200 <Maxdamantus> merijn: right, the Unicode forms are "UTF-8", "UTF-16" and "UTF-32".
2022-06-22 10:34:09 +0200 <merijn> Maxdamantus: You suggested utf-8 for linux filenames
2022-06-22 10:34:15 +0200 <Maxdamantus> merijn: those exist independently of any interpretation of actual code points. That happens elsewhere in the standard.
2022-06-22 10:34:22 +0200 <merijn> Maxdamantus: linux filenames are ill-formed per D84
2022-06-22 10:34:42 +0200 <merijn> anyway, meeting
2022-06-22 10:34:49 +0200 <tomsmeding> merijn: hence Maxdamantus is suggesting using a string type that can represent non-well-formed strings
2022-06-22 10:34:54 +0200 <Maxdamantus> merijn: right, I would suggest treating filenames as UTF-8 Unicode strings, just ones that are not necessarily well-formed (aka, not necessarily "in UTF-8")
2022-06-22 10:35:21 +0200 <merijn> Which seems rather inferior to the strictly more correct ByteString representation
2022-06-22 10:35:28 +0200 <tomsmeding> Maxdamantus: how much of a performance penalty do you get from using such an implementation, as compared to one that can assume its internal representation _is_ well-formed
2022-06-22 10:35:39 +0200 <tomsmeding> and what gains do you get :p
2022-06-22 10:36:11 +0200 <tomsmeding> I'd expect that the only time when you want to get the "text-like" data in a linux filename is when you want to show it to the user -- and at that point you can just do a lenient UTF8 decode
2022-06-22 10:36:19 +0200 <Maxdamantus> tomsmeding: it's a negative penalty. You pay a penalty when using restrictively well-formed strings because you need to check that the string is well-formed when reading it.
2022-06-22 10:36:42 +0200 <Maxdamantus> tomsmeding: there's some interesting commentary around that in the documentation for Rust's `bstr` package.
2022-06-22 10:37:07 +0200 <Maxdamantus> where the author gives examples of things like treating a mmapped file as a string.
2022-06-22 10:37:36 +0200 <Maxdamantus> you can't do that if the string library requires the bytes be well-formed, since you'd have to scan through the entire file to check it before allowing it to be used.
2022-06-22 10:37:43 +0200shriekingnoise(~shrieking@201.212.175.181) (Quit: Quit)
2022-06-22 10:37:45 +0200 <Maxdamantus> https://docs.rs/bstr/latest/bstr/
2022-06-22 10:39:06 +0200Henkru(henkru@kapsi.fi) (Ping timeout: 264 seconds)
2022-06-22 10:39:11 +0200 <Maxdamantus> I don't think there would be any significant performance benefits in any cases by restricting to well-formed strings.
2022-06-22 10:39:19 +0200acidjnk(~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) (Leaving)
2022-06-22 10:39:20 +0200 <kritzefitz> Maxdamantus: regardless of performance, having to expect Texts to contain invalid encodings sounds like a nightmare to me. It's already a common enough pitfall to assume that any input you receive is well encoded, having to catch those failures on almost all Text operations would be far harder to handle than just having to catch failures when decoding.
2022-06-22 10:39:26 +0200 <Maxdamantus> it only has negative performance impacts due to the extra checking that needs to be done at boundaries.
2022-06-22 10:39:35 +0200Henkru(henkru@kapsi.fi)
2022-06-22 10:40:13 +0200 <Maxdamantus> kritzefitz: that shouldn't be necessary.
2022-06-22 10:40:42 +0200 <Maxdamantus> kritzefitz: the only operation that could "fail" would be iterating through code points, and that iterator could transparently emit replacement characters by default.
2022-06-22 10:40:52 +0200 <tomsmeding> though UnnormText.unpack would return an Either
2022-06-22 10:40:58 +0200 <Maxdamantus> (not that iterating through code points is a particularly common operation)
2022-06-22 10:41:05 +0200 <tomsmeding> right
2022-06-22 10:42:27 +0200 <Maxdamantus> What's `UnnormText.unpack`?
2022-06-22 10:42:45 +0200 <Maxdamantus> in general these things shouldn't need to produce errors.
2022-06-22 10:42:47 +0200 <tomsmeding> the unpack :: Text -> [Char] function of this hypothetical haskell library that implements non-well-formed strings
2022-06-22 10:43:08 +0200 <tomsmeding> unless you expect it to round-trip with [Char] -> Text
2022-06-22 10:43:19 +0200 <tomsmeding> that's going to fail if there are encoding errors in the bytestring
2022-06-22 10:44:11 +0200 <tomsmeding> so you'd have unpackStrict :: Text -> Maybe [Char], or unpackStrict' :: Text -> [Either Word8 Char], or unpackLenient :: Text -> [Char] that replaces stuff with U+FFFD
2022-06-22 10:44:18 +0200 <Maxdamantus> Right, I'd probably just expect it to emit replacement characters.
2022-06-22 10:44:46 +0200 <Maxdamantus> since that's the normal thing to do when encountering errors while converting between Unicode representations.
2022-06-22 10:45:14 +0200 <tomsmeding> I'd also want a round-tripping version, or at least one that alerts me that round-tripping isn't going to work
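[Editor's note] The hypothetical unpackStrict / unpackLenient variants, plus a round-trip check, can be approximated today by keeping the raw bytes in a ByteString and decoding on demand. This sketch assumes the bytestring and text packages; decodeUtf8', decodeUtf8With and lenientDecode are existing text functions, while the three function names below are made up to mirror the discussion.

```haskell
import qualified Data.ByteString as B
import           Data.ByteString (ByteString)
import qualified Data.Text as T
import           Data.Text.Encoding (decodeUtf8', decodeUtf8With, encodeUtf8)
import           Data.Text.Encoding.Error (lenientDecode)

-- | Fails (Nothing) if the bytes are not well-formed UTF-8.
unpackStrict :: ByteString -> Maybe String
unpackStrict = either (const Nothing) (Just . T.unpack) . decodeUtf8'

-- | Never fails: ill-formed input is replaced by U+FFFD.
unpackLenient :: ByteString -> String
unpackLenient = T.unpack . decodeUtf8With lenientDecode

-- | True when lenient decoding followed by re-encoding gives back the
-- original bytes, i.e. no information was lost.
roundTrips :: ByteString -> Bool
roundTrips bs = encodeUtf8 (decodeUtf8With lenientDecode bs) == bs

-- | Example input: 'a' followed by a lone continuation byte.
illFormedExample :: ByteString
illFormedExample = B.pack [0x61, 0x80]
```

With this split the original bytes stay available for passing back to the filesystem, and the lossy step happens only where text is actually needed.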
2022-06-22 10:45:28 +0200 <Maxdamantus> eg, I suspect that's what will happen if you open a web browser console and do `document.body.innerHTML = "hello \ud800 world";`
2022-06-22 10:45:49 +0200 <tomsmeding> it's not like we're dealing with the whole zoo of weird unicode encodings where you have 100% chance that _something_ in your text is going to be unrepresentable in one of those encodings
2022-06-22 10:46:06 +0200 <tomsmeding> Maxdamantus: yes, but there we're dealing with UI :p
2022-06-22 10:46:18 +0200 <tomsmeding> but yes, mostly one would use my unpackLenient
2022-06-22 10:46:31 +0200 <tomsmeding> interesting, didn't think of this as an issue before
2022-06-22 10:46:57 +0200 <tomsmeding> purists would say "what even are you thinking, linux filenames are not intended to be utf8 so treat them as bytestrings"
2022-06-22 10:47:06 +0200 <tomsmeding> but practice says "well mostly they're utf8 mostly"
2022-06-22 10:47:19 +0200 <Maxdamantus> Hm, interestingly Firefox actually renders the UTF-16 code unit as an error, and it converts it to a replacement character when doing something like copying to the clipboard. That's quite neat.
2022-06-22 10:47:57 +0200 <kritzefitz> Maxdamantus: What do you expect to gain from a Text representation that allows badly encoded underlying bytestrings? Keeping the original bytes in a ByteString and only decoding when you want to do something that explicitly requires code points seems to me like it gets you the same behavior as you described.
2022-06-22 10:48:22 +0200 <Maxdamantus> After people fix Unicode handling in programming languages, that's my next desire: text applications should be able to render UTF-8 errors, and it should be possible to copy the ill-formed UTF-8 around without losing information.
2022-06-22 10:48:34 +0200 <tomsmeding> kritzefitz: it would only give you convenience
2022-06-22 10:49:05 +0200 <tomsmeding> Maxdamantus's example of Rust's bstr library explicitly says that most of its functionality you can obtain by piecing together existing code that e.g. does regex stuff on bytestrings
2022-06-22 10:49:27 +0200Henkru(henkru@kapsi.fi) (Ping timeout: 256 seconds)
2022-06-22 10:49:36 +0200 <Maxdamantus> kritzefitz: it means you can use the correct type for things. Things like filenames should be strings, not `ByteStrings`, and they should have convenient handling of Unicode.
2022-06-22 10:50:07 +0200 <Maxdamantus> kritzefitz: at the moment, API designers have to decide whether to use `ByteString` for correctness or `Text` for convenience.
2022-06-22 10:50:25 +0200 <Maxdamantus> kritzefitz: if `Text` handled ill-formed Unicode strings, you'd get both with one type.
2022-06-22 10:50:54 +0200 <kritzefitz> But what convenient handling of unicode do you get? When is a Text actually more convenient if you're not allowed to assume that it contains only valid code points?
2022-06-22 10:51:07 +0200foul_owl(~kerry@23.82.194.107) (Ping timeout: 260 seconds)
2022-06-22 10:51:28 +0200 <Maxdamantus> kritzefitz: if `Text` is not more convenient, then why doesn't everyone just use `ByteString` for things like filenames or user input?
2022-06-22 10:53:24 +0200 <merijn> Maxdamantus: Why should filenames be strings?
2022-06-22 10:53:43 +0200 <tomsmeding> Maxdamantus: suggestion if you pitch this to people: avoid getting the unicode spec out. The argument does not rest on "non-well-formed" being defined by the unicode spec; the argument rests on convenience in software engineering. When you throw specs at people, they throw specs back, and the spec for linux filenames is _not_ that they are unicode
2022-06-22 10:53:49 +0200 <tomsmeding> never mind reality where they mostly are
2022-06-22 10:53:59 +0200 <tomsmeding> (but not always)
2022-06-22 10:54:27 +0200 <merijn> Also, the ability to assume that any Text in your codebase will always remain valid unicode is pretty huge
2022-06-22 10:54:28 +0200 <Maxdamantus> merijn: because there should be a ubiquitous string type for text.
2022-06-22 10:54:29 +0200 <tomsmeding> the unicode spec just gives you precedent for your terminology, which is nice but not essential to the pitch
2022-06-22 10:54:49 +0200 <merijn> Maxdamantus: Says who?
2022-06-22 10:55:10 +0200 <merijn> If anything, I think we need *more* string types and better support for being polymorphic over them
2022-06-22 10:55:31 +0200 <tomsmeding> (current IsString is awful for that)
2022-06-22 10:55:49 +0200 <Maxdamantus> But what's the advantage of having the other string types?
2022-06-22 10:55:53 +0200 <merijn> tomsmeding: I proposed a better interface for IsString and other polymorphic literals
2022-06-22 10:56:03 +0200 <merijn> Maxdamantus: Different types are optimised for different uses
2022-06-22 10:56:17 +0200 <merijn> tomsmeding: That's what led to validated-literals :p
2022-06-22 10:56:25 +0200 <Maxdamantus> merijn: optimised in terms of API convenience, or optimised in terms of performance?
2022-06-22 10:56:32 +0200 <merijn> Both
2022-06-22 10:56:50 +0200 <merijn> Your proposal is less convenient for both for 99% of code
2022-06-22 10:56:59 +0200 <Maxdamantus> I don't think you're going to get better performance by having different string types (in particular, I explained how it results in worse performance)
2022-06-22 10:57:23 +0200 <Maxdamantus> and I don't think you get better API convenience either. As I said, it means that API designers have to pick from various string types.
2022-06-22 10:57:55 +0200 <merijn> Maxdamantus: "i can never trust any string in my entire codebase" is a pretty fucking massive downgrade in API usability, no matter what else you propose
2022-06-22 10:58:05 +0200 <Maxdamantus> which string library do I import again when using library xyz?
2022-06-22 10:58:28 +0200raym(~raym@user/raym)
2022-06-22 10:58:33 +0200 <merijn> Maxdamantus: See aforementioned point of "I'd rather get better solutions for being polymorphic across string types"
2022-06-22 10:58:37 +0200arthurs115(~arthurs11@163.5.10.155)
2022-06-22 10:59:15 +0200 <merijn> because that will be useful for lots of other things too
2022-06-22 10:59:21 +0200 <Maxdamantus> merijn: what do you mean by "trust any string"? Do you trust the string "􏿿"?
2022-06-22 10:59:48 +0200 <merijn> Maxdamantus: "any string I make by combining well-formed Text will be well-formed Text"
2022-06-22 10:59:50 +0200 <Maxdamantus> is "􏿿" more trustable than a string containing ill-formed UTF-8.
2022-06-22 10:59:55 +0200 <Maxdamantus> s/.$/?/
2022-06-22 11:00:10 +0200emliunix(~emliunixm@2001:470:69fc:105::2:12d1) (Quit: You have been kicked for being idle)
2022-06-22 11:00:10 +0200 <merijn> Yes
2022-06-22 11:00:27 +0200 <merijn> Because the former is much more well-behaved
2022-06-22 11:00:29 +0200emliunix(~emliunixm@2001:470:69fc:105::2:12d1)
2022-06-22 11:00:50 +0200 <Maxdamantus> It wouldn't be more well-behaved if there's only one string type.
2022-06-22 11:00:59 +0200 <Maxdamantus> The behaviours only occur when converting between encodings.
2022-06-22 11:01:13 +0200 <Maxdamantus> part of the point is to avoid converting between encodings.
2022-06-22 11:01:29 +0200emliunix(~emliunixm@2001:470:69fc:105::2:12d1) ()
2022-06-22 11:01:31 +0200 <Maxdamantus> UTF-16 is dying out, so that shouldn't be a major concern.
2022-06-22 11:02:02 +0200 <Maxdamantus> most string handling should be taking UTF-8 bytes from a network or filesystem and sending them back to the network or filesystem.
2022-06-22 11:03:16 +0200 <kritzefitz> Maxdamantus: I think I mostly use Text to be able to handle unicode regardless of the underlying encoding and the sense of security merijn mentioned. For the cases that you mention, where you only retrieve UTF-8 from somewhere and only pass it back relatively unmodified, I really don't see why Text would be more convenient than ByteString.
2022-06-22 11:03:24 +0200 <Maxdamantus> there might also be some awkwardness when converting to Haskell `String`s (aka, `[Char]`), but those issues already exist with `Text`
2022-06-22 11:03:48 +0200 <Maxdamantus> eg, "\55296" is a possible `String`, but it can't be converted to `Text`.
2022-06-22 11:04:01 +0200 <merijn> Maxdamantus: ???
2022-06-22 11:04:11 +0200 <merijn> ALL strings are by definition convertible to Text
2022-06-22 11:04:21 +0200 <Maxdamantus> merijn: I'm pretty sure that one isn't.
2022-06-22 11:04:24 +0200 <merijn> Why?
2022-06-22 11:04:28 +0200 <kritzefitz> Also I don't think the assumption that everything is UTF-8 and you don't need to care about other encodings is valid for a general purpose language. There are tons of contexts that need to deal with all kinds of encodings and they're not likely to go away.
2022-06-22 11:04:33 +0200 <merijn> > text "\55296"
2022-06-22 11:04:34 +0200 <lambdabot> mueval-core: <stdout>: hPutChar: invalid argument (invalid character)
2022-06-22 11:04:36 +0200 <BusConscious> merijn: As you say I would be more inclined to use ByteString, if I could use it like a normal string in an overloaded syntax and if things like Text.Parsec.ByteString had the same functionality as say Text.Parsec.String
2022-06-22 11:04:44 +0200foul_owl(~kerry@23.82.194.107)
2022-06-22 11:04:55 +0200 <Maxdamantus> merijn: because it can't be encoded as a well-formed Unicode string.
2022-06-22 11:04:58 +0200 <merijn> > generalCategory '\55296'
2022-06-22 11:04:59 +0200 <lambdabot> Surrogate
2022-06-22 11:05:20 +0200 <merijn> > text "\55296a"
2022-06-22 11:05:22 +0200 <lambdabot> mueval-core: <stdout>: hPutChar: invalid argument (invalid character)
2022-06-22 11:05:24 +0200 <Maxdamantus> merijn: none of the Unicode encoding forms allow that code point to be encoded (it is not a Unicode scalar value).
2022-06-22 11:06:48 +0200chele(~chele@user/chele)
2022-06-22 11:07:15 +0200 <kritzefitz> merijn: Apparently `Data.Text.pack "\55296"` has the same result as `Data.Text.pack "\65533"`.
2022-06-22 11:07:40 +0200 <Maxdamantus> "\65533" is U+FFFD, ie, the replacement character.
2022-06-22 11:08:04 +0200 <Maxdamantus> so that means that `pack` is emitting replacement characters on error, which is the behaviour I said is reasonable earlier.
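[Editor's note] The behaviour kritzefitz observed can be reproduced in a few lines, assuming the text package. packLosesInfo is a hypothetical helper name; T.pack and T.unpack are the real functions.

```haskell
import qualified Data.Text as T

-- | True when converting a String to Text and back changes it, which
-- happens when the String contains surrogate code points that pack
-- replaces with U+FFFD.
packLosesInfo :: String -> Bool
packLosesInfo s = T.unpack (T.pack s) /= s

-- | The lone surrogate from the discussion and the replacement character.
loneSurrogate, replacementChar :: String
loneSurrogate   = "\55296"
replacementChar = "\65533"
```

In GHCi, T.pack loneSurrogate == T.pack replacementChar evaluates to True, matching the observation above, while packLosesInfo "hello" is False.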
2022-06-22 11:10:02 +0200benin02(~benin@183.82.26.120)
2022-06-22 11:12:10 +0200benin0(~benin@183.82.26.120) (Ping timeout: 268 seconds)
2022-06-22 11:12:10 +0200benin02benin0
2022-06-22 11:12:28 +0200 <kritzefitz> Maxdamantus: From what you said, it seems to me like we would need a new type MaybeInvalidText that preserves its original encoding, while mostly acting like a Text. And I'm not trying to be argumentative here, but I really don't see what I gain from using it over a ByteString and I also didn't find your previous comments on that very enlightening. Can you give some example where it would do something for you that ByteString can't?
2022-06-22 11:12:30 +0200jmdaemon(~jmdaemon@user/jmdaemon) (Quit: ZNC 1.8.2 - https://znc.in)
2022-06-22 11:13:47 +0200 <Maxdamantus> kritzefitz: it's useful because it can become the de facto string type. I suspect there are various APIs in Haskell that treat filenames as either `ByteString`, `String` or `Text`.
2022-06-22 11:14:04 +0200 <Maxdamantus> kritzefitz: preferably there would only be one type for representing filenames.
2022-06-22 11:14:35 +0200 <Maxdamantus> kritzefitz: and preferably there should be no cost (development or performance-wise) when using such strings for different purposes.
2022-06-22 11:14:37 +0200 <kritzefitz> Ah, ok. I guess then I don't follow, because I just don't agree on the premise that there needs to be one de facto string type.
2022-06-22 11:15:51 +0200 <Maxdamantus> If I want to write a program that scans a directory and prints the filenames to standard out, I shouldn't have to convert from the `FileName` string type to the `PuttableString` string type.
2022-06-22 11:16:19 +0200 <Maxdamantus> filenames are strings, and I can print strings to standard out.
2022-06-22 11:16:56 +0200 <Maxdamantus> the "hello world" program shouldn't be converting to `ByteString` just because standard out is technically a binary file.
2022-06-22 11:17:34 +0200ubert(~Thunderbi@p200300ecdf0da56677798f1bce3bed29.dip0.t-ipconnect.de) (Remote host closed the connection)
2022-06-22 11:17:52 +0200ubert(~Thunderbi@p200300ecdf0da56600626fc30d47cd25.dip0.t-ipconnect.de)
2022-06-22 11:17:58 +0200progress__(~fffuuuu_i@45.112.243.220) (Quit: Leaving)
2022-06-22 11:20:10 +0200 <kritzefitz> I really don't see why you're so afraid of converting between things. Explicit conversion gives you the ability to actually specify things like what to do on errors. Having a one-size-fits-all typically leads to wrong behavior that you have no influence over.
2022-06-22 11:20:58 +0200 <kritzefitz> And forcing people to actually mention the conversion explicitly forces them to think about what they're actually trying to do. Not having to think about the conversion will often mean being later surprised when it doesn't do what you intended.
2022-06-22 11:21:22 +0200 <Maxdamantus> because conversion is not always possible. either information is lost (errors replaced by replacement characters), or errors are raised.
2022-06-22 11:22:07 +0200 <Maxdamantus> if there's no fear in converting between things, there wouldn't be an API in Haskell that treats filenames as `ByteString`.
2022-06-22 11:23:09 +0200 <Maxdamantus> https://hackage.haskell.org/package/unix-2.7.2.2/docs/System-Posix-ByteString.html#t:RawFilePath
2022-06-22 11:23:45 +0200 <Maxdamantus> If there's no problem with conversion, they'd just have `type RawFilePath = Text` or `type RawFilePath = String`.
2022-06-22 11:23:55 +0200 <sm> tuning in late, I'm sure it was already said, but filenames aren't strings
2022-06-22 11:24:15 +0200 <kritzefitz> Using error replacements or raising errors is IMO a good thing. If you don't want that, you probably don't want Text. If you need a Text, you have to deal with the errors in some way anyway.
2022-06-22 11:24:25 +0200 <Maxdamantus> They're not Haskell strings at least, but they are Unicode strings (aka, bytestrings in the case of UTF-8).
2022-06-22 11:24:53 +0200sminvokes maerwald
2022-06-22 11:24:59 +0200 <Maxdamantus> sm: (earlier on I pointed out that standard Unicode strings are not necessarily well-formed, and that the standard describes UTF-8 as effectively equivalent to "bytestrings")
2022-06-22 11:25:43 +0200 <Maxdamantus> anyway, it's also getting kind of late for me, need to do other stuff this evening.
2022-06-22 11:26:11 +0200 <sm> very well, carry on!  👍🏻
2022-06-22 11:28:38 +0200 <merijn> Maxdamantus: Unix package has lots of questionable API design regardless, tbh
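The trade-off argued over above (raise an error, or lose information to replacement characters) can be sketched as follows. This is only an illustration, assuming the `text` and `bytestring` packages are available; the byte sequence is made up for the example and is not taken from the discussion.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', decodeUtf8With, encodeUtf8)
import Data.Text.Encoding.Error (lenientDecode)

main :: IO ()
main = do
  -- "foo" followed by 0xFF, a byte that can never occur in well-formed UTF-8.
  let raw = BS.pack [0x66, 0x6f, 0x6f, 0xff]

  -- Option 1: strict decoding surfaces the problem as a Left value.
  case decodeUtf8' raw of
    Left err  -> putStrLn ("strict decode failed: " ++ show err)
    Right txt -> putStrLn ("strict decode ok: " ++ T.unpack txt)

  -- Option 2: lenient decoding substitutes U+FFFD for the bad byte,
  -- so re-encoding no longer yields the original bytes.
  let lossy = decodeUtf8With lenientDecode raw
  putStrLn ("round-trips losslessly: " ++ show (encodeUtf8 lossy == raw))
```

Keeping the value as a `ByteString` avoids both outcomes, which is the point made about `RawFilePath`.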
2022-06-22 11:32:41 +0200jgeerds(~jgeerds@55d45f48.access.ecotel.net)
2022-06-22 11:40:22 +0200justromeon(~justromeo@120.29.68.81)
2022-06-22 11:41:21 +0200justromeon(~justromeo@120.29.68.81) (Client Quit)
2022-06-22 11:41:45 +0200justromeon(~justromeo@120.29.68.81)
2022-06-22 11:44:06 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Remote host closed the connection)
2022-06-22 11:44:29 +0200justromeon(~justromeo@120.29.68.81) (Client Quit)
2022-06-22 11:45:04 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 11:48:33 +0200justromeon(~justromeo@120.29.68.81)
2022-06-22 11:49:50 +0200justromeon(~justromeo@120.29.68.81) (Client Quit)
2022-06-22 11:49:54 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 264 seconds)
2022-06-22 11:55:02 +0200ccntrq1(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 11:55:05 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 248 seconds)
2022-06-22 11:55:05 +0200ccntrq1ccntrq
2022-06-22 11:58:50 +0200lisbeths(uid135845@id-135845.lymington.irccloud.com)
2022-06-22 11:58:51 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection)
2022-06-22 11:59:05 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 12:02:34 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 12:03:27 +0200arthurs115(~arthurs11@163.5.10.155) (Remote host closed the connection)
2022-06-22 12:04:27 +0200alp__(~alp@user/alp)
2022-06-22 12:05:04 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection)
2022-06-22 12:05:49 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 12:06:55 +0200adanwan(~adanwan@gateway/tor-sasl/adanwan) (Remote host closed the connection)
2022-06-22 12:07:25 +0200adanwan(~adanwan@gateway/tor-sasl/adanwan)
2022-06-22 12:08:22 +0200raym(~raym@user/raym) (Ping timeout: 244 seconds)
2022-06-22 12:08:33 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Read error: Connection reset by peer)
2022-06-22 12:08:45 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 12:09:33 +0200kristjansson(sid126207@tinside.irccloud.com) (Ping timeout: 276 seconds)
2022-06-22 12:10:23 +0200raym(~raym@user/raym)
2022-06-22 12:11:05 +0200xff0x(~xff0x@125x103x176x34.ap125.ftth.ucom.ne.jp) (Ping timeout: 248 seconds)
2022-06-22 12:12:12 +0200kristjansson(sid126207@id-126207.tinside.irccloud.com)
2022-06-22 12:13:53 +0200cfricke(~cfricke@user/cfricke) (Ping timeout: 256 seconds)
2022-06-22 12:15:08 +0200ccntrq1(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 12:15:16 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection)
2022-06-22 12:15:18 +0200ccntrq1ccntrq
2022-06-22 12:15:38 +0200Surobaki(~surobaki@137.44.222.80)
2022-06-22 12:18:13 +0200econo(uid147250@user/econo) (Quit: Connection closed for inactivity)
2022-06-22 12:20:18 +0200justromeon(~justromeo@120.29.68.81)
2022-06-22 12:20:48 +0200justromeon(~justromeo@120.29.68.81) (Client Quit)
2022-06-22 12:21:07 +0200justromeon(~justromeo@120.29.68.81)
2022-06-22 12:21:18 +0200justromeon(~justromeo@120.29.68.81) (Client Quit)
2022-06-22 12:21:25 +0200xnorfzt(~xnorfzt@2a02:908:d88:320:b5c0:b85f:3ec0:5838)
2022-06-22 12:21:41 +0200justromeon(~justromeo@120.29.68.81)
2022-06-22 12:22:28 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 268 seconds)
2022-06-22 12:24:10 +0200 <xnorfzt> Hi all! I'm trying to convert a number of seconds to the difference in hours, minutes and seconds as a `(Int, Int, Int)`. I can do the math by myself using divMod, but is there an ultra-readable way using the time library? I found out that I can create `DiffTime` values with `fromIntegral`, but how do I access the resulting single components without `format`ting the time difference?
2022-06-22 12:24:50 +0200 <xnorfzt> Whoops - sorry for the broken code markup. So used to markdown...
2022-06-22 12:25:05 +0200justromeon(~justromeo@120.29.68.81) (Client Quit)
2022-06-22 12:26:09 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 12:26:56 +0200cfricke(~cfricke@user/cfricke)
2022-06-22 12:27:35 +0200justromeon(~justromeo@120.29.68.81)
2022-06-22 12:28:02 +0200justromeon(~justromeo@120.29.68.81) (Client Quit)
2022-06-22 12:28:57 +0200BusConscious(~martin@ip5f5bdedc.dynamic.kabel-deutschland.de) (Remote host closed the connection)
2022-06-22 12:29:13 +0200raym(~raym@user/raym) (Ping timeout: 248 seconds)
2022-06-22 12:30:42 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 264 seconds)
2022-06-22 12:30:42 +0200ccntrq1(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 12:30:59 +0200raym(~raym@user/raym)
2022-06-22 12:32:09 +0200zaquest(~notzaques@5.130.79.72) (Remote host closed the connection)
2022-06-22 12:33:11 +0200ccntrq1ccntrq
2022-06-22 12:35:40 +0200Midjak(~Midjak@82.66.147.146)
2022-06-22 12:36:02 +0200fnurglewitz(uid263868@id-263868.lymington.irccloud.com)
2022-06-22 12:37:37 +0200chexum_(~quassel@gateway/tor-sasl/chexum)
2022-06-22 12:39:50 +0200chexum(~quassel@gateway/tor-sasl/chexum) (Remote host closed the connection)
2022-06-22 12:40:46 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection)
2022-06-22 12:41:03 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 12:42:50 +0200chexum_(~quassel@gateway/tor-sasl/chexum) (Ping timeout: 268 seconds)
2022-06-22 12:46:47 +0200chexum(~quassel@gateway/tor-sasl/chexum)
2022-06-22 12:48:23 +0200azimut(~azimut@gateway/tor-sasl/azimut) (Ping timeout: 268 seconds)
2022-06-22 12:48:57 +0200merijn(~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl) (Ping timeout: 248 seconds)
2022-06-22 12:51:07 +0200zaquest(~notzaques@5.130.79.72)
2022-06-22 12:52:50 +0200azimut(~azimut@gateway/tor-sasl/azimut)
2022-06-22 12:58:14 +0200 <tomsmeding> to be honest if you want the most _readable_ option, I vote for \n -> (n `div` 3600, n `div` 60 `mod` 60, n `mod` 60)
2022-06-22 12:58:28 +0200chomwitt(~chomwitt@2a02:587:dc0d:e600:1174:892d:39e3:5e01) (Quit: Leaving)
2022-06-22 12:58:33 +0200 <tomsmeding> since 'time' doesn't seem to have a dedicated function for this
2022-06-22 12:59:09 +0200Henkru(henkru@kapsi.fi)
2022-06-22 13:00:56 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection)
2022-06-22 13:01:19 +0200 <int-e> > (\case ts | (tm, s) <- ts `divMod` 60, (th, m) <- tm `divMod` 60 -> (th,m,s)) 4242 -- scnr
2022-06-22 13:01:21 +0200 <lambdabot> (1,10,42)
2022-06-22 13:01:51 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 13:02:41 +0200xff0x(~xff0x@b133147.ppp.asahi-net.or.jp)
2022-06-22 13:04:00 +0200 <tomsmeding> int-e: why a lambdacase instead of \ts -> let (tm, s) = ts `divMod` 60 ; ...
2022-06-22 13:04:00 +0200xnorfzt thinks about time-lens :D
2022-06-22 13:04:16 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection)
2022-06-22 13:04:32 +0200 <tomsmeding> nice variable names, though
2022-06-22 13:04:49 +0200 <int-e> tomsmeding: because then I wouldn't get to (ab)use pattern guards
2022-06-22 13:05:03 +0200 <tomsmeding> why are pattern guards better than simple let clauses in this case :p
2022-06-22 13:05:04 +0200 <int-e> I wanted to call them all t.
2022-06-22 13:05:12 +0200 <tomsmeding> right
2022-06-22 13:05:19 +0200 <xnorfzt> tomsmeding int-e - makes sense, it's pretty short and readable, but not what I'm looking for. <3
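For reference, the `divMod` approach suggested above can be written as a small named function. This is a minimal sketch using only `base`; the name `toHMS` is hypothetical, and it assumes a non-negative input.

```haskell
-- | Split a non-negative number of seconds into (hours, minutes, seconds).
toHMS :: Int -> (Int, Int, Int)
toHMS total = (h, m, s)
  where
    -- Peel off the seconds first, then split the remaining minutes.
    (totalMinutes, s) = total `divMod` 60
    (h, m)            = totalMinutes `divMod` 60

main :: IO ()
main = print (toHMS 4242)  -- (1,10,42), the same result lambdabot reported
```

Hours are not wrapped at 24, so durations longer than a day simply produce a larger hour count.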
2022-06-22 13:05:56 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 13:07:46 +0200lyle(~lyle@104.246.145.85)
2022-06-22 13:08:43 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Remote host closed the connection)
2022-06-22 13:09:18 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 13:10:51 +0200xnorfzt(~xnorfzt@2a02:908:d88:320:b5c0:b85f:3ec0:5838) (Quit: xnorfzt)
2022-06-22 13:12:30 +0200fryguybob(~fryguybob@cpe-74-67-169-145.rochester.res.rr.com) (Quit: leaving)
2022-06-22 13:12:37 +0200sympt(~sympt@user/sympt) (Read error: Connection reset by peer)
2022-06-22 13:13:45 +0200sympt(~sympt@user/sympt)
2022-06-22 13:13:57 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 256 seconds)
2022-06-22 13:15:02 +0200merijn(~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl)
2022-06-22 13:15:47 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 13:24:11 +0200geekosaur(~geekosaur@xmonad/geekosaur) (Read error: Connection reset by peer)
2022-06-22 13:24:18 +0200allbery_b(~geekosaur@xmonad/geekosaur)
2022-06-22 13:24:21 +0200allbery_bgeekosaur
2022-06-22 13:27:46 +0200Surobaki(~surobaki@137.44.222.80) (Quit: Leaving)
2022-06-22 13:28:12 +0200odnes(~odnes@5-203-220-108.pat.nym.cosmote.net)
2022-06-22 13:30:23 +0200Surobaki(~surobaki@137.44.222.80)
2022-06-22 13:30:41 +0200coot(~coot@213.134.190.95) (Quit: coot)
2022-06-22 13:34:16 +0200coot(~coot@213.134.190.95)
2022-06-22 13:35:36 +0200dschrempf(~dominik@070-207.dynamic.dsl.fonira.net)
2022-06-22 13:39:01 +0200coot(~coot@213.134.190.95) (Client Quit)
2022-06-22 13:40:07 +0200 <maerwald> Maxdamantus: https://github.com/haskellfoundation/tech-proposals/issues/35
2022-06-22 13:42:26 +0200rendar(~Paxman@user/rendar)
2022-06-22 13:42:39 +0200haritzondo(~hrtz@82-69-11-11.dsl.in-addr.zen.co.uk) (Changing host)
2022-06-22 13:42:39 +0200haritzondo(~hrtz@user/haritz)
2022-06-22 13:42:48 +0200haritzondoharitz
2022-06-22 13:48:31 +0200geekosaur(~geekosaur@xmonad/geekosaur) (Ping timeout: 256 seconds)
2022-06-22 13:49:29 +0200geekosaur(~geekosaur@xmonad/geekosaur)
2022-06-22 13:50:02 +0200benin0(~benin@183.82.26.120) (Quit: The Lounge - https://thelounge.chat)
2022-06-22 13:50:35 +0200 <Maxdamantus> maerwald: hm, seems a lot more complicated/tedious than just using a string type capable of handling any byte sequence, where WTF-8 would be used for handling filenames.
2022-06-22 13:51:01 +0200 <maerwald> Maxdamantus: I thought about using WTF-8, but I don't like it
2022-06-22 13:51:22 +0200 <maerwald> I'm not sure you can easily reconstruct underlying encoding information from WTF-8... it would be complicated
2022-06-22 13:51:23 +0200 <Maxdamantus> and yeah, the filenames thing is just an obvious example. The same thing applies to simply reading text from files.
2022-06-22 13:51:40 +0200 <maerwald> the idea is to stop messing with the data that syscalls return
2022-06-22 13:52:05 +0200dsrt^(~dsrt@50.237.44.186)
2022-06-22 13:53:37 +0200 <Maxdamantus> (reading text from files should be a simpler problem because files are at least still [Word8] on Windows, rather than [Word16])
2022-06-22 13:54:27 +0200 <Maxdamantus> imo Windows' use of UTF-16 shouldn't be a reason to complicate the API for other platforms.
2022-06-22 13:55:03 +0200 <Maxdamantus> WTF-8 is slightly ugly, but it's only used to address an ugly API that might be obsolete soon anyway.
2022-06-22 13:55:08 +0200 <maerwald> Maxdamantus: how is the API more complicated? These details are hidden behind a newtype
2022-06-22 13:55:53 +0200 <Maxdamantus> maerwald: how do you print a filename to standard out?
2022-06-22 13:55:56 +0200raehik(~raehik@cpc95906-rdng25-2-0-cust156.15-3.cable.virginm.net)
2022-06-22 13:56:12 +0200 <maerwald> putStr filepath
2022-06-22 13:56:12 +0200 <Maxdamantus> presumably involves a conversion.
2022-06-22 13:57:01 +0200 <hpc> you have to pick an encoding anyway on linux, since it's not specified
2022-06-22 13:57:12 +0200jgeerds(~jgeerds@55d45f48.access.ecotel.net) (Ping timeout: 248 seconds)
2022-06-22 13:57:49 +0200 <Maxdamantus> maerwald: so filepath is still a `String`?
2022-06-22 13:57:53 +0200 <maerwald> Maxdamantus: no
2022-06-22 13:58:27 +0200 <Maxdamantus> so `putStr` is made polymorphic over things that are like strings?
2022-06-22 13:58:50 +0200 <merijn> I wonder if this discussion will answer my questions vis-a-vis unstoppable forces and immovable objects :)
2022-06-22 13:58:59 +0200 <hpc> it's not making the api more complicated, it's making it more accurate
2022-06-22 13:59:02 +0200 <maerwald> Maxdamantus: sorry, I meant `print`
2022-06-22 13:59:14 +0200 <hpc> the current api is simple in the same way javascript is simple
2022-06-22 14:00:06 +0200 <hpc> right now, (putStr filepath) is complicated to the programmer in ridiculous ways
2022-06-22 14:00:14 +0200 <Maxdamantus> hpc: JavaScript's API is simple and accurate as long as you only deal with Windows filename APIs.
2022-06-22 14:00:24 +0200 <hpc> on windows it just works, because somewhere there was magic to convert from utf-16
2022-06-22 14:00:35 +0200 <hpc> on linux you have to hope and pray, because filenames are just bytes
2022-06-22 14:01:02 +0200 <maerwald> hpc: well, on windows, you also may have invalid UTF-16
2022-06-22 14:01:10 +0200 <maerwald> the encoding is in fact UCS-2
2022-06-22 14:01:19 +0200 <hpc> ugh, right
2022-06-22 14:01:19 +0200 <maerwald> so you can have invalid surrogate pairs
2022-06-22 14:02:00 +0200 <Maxdamantus> The fact that JS and windows are based around 16-bit strings is just a historical oddity. Going forward, the preference should be 8-bit strings, and we can use WTF-8 for backwards compatibility. Don't need to complicate the APIs for backwards-compatibility.
2022-06-22 14:02:11 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net)
2022-06-22 14:02:38 +0200 <merijn> maerwald: Pretty sure it's proper utf-16 now?
2022-06-22 14:02:49 +0200 <merijn> not 100% sure though
2022-06-22 14:02:53 +0200 <Maxdamantus> merijn: pretty sure it's not.
2022-06-22 14:02:57 +0200 <maerwald> merijn: no, you can easily create filepaths via the system API that are not UTF-16
2022-06-22 14:03:11 +0200 <Maxdamantus> unless you're talking about Windows 11. Haven't tested that personally.
2022-06-22 14:03:37 +0200 <Maxdamantus> (I've certainly experimented with this stuff on Windows 10)
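The point about invalid surrogate pairs can be made concrete with a check over raw 16-bit code units. This is a sketch using only `base`; the function names are hypothetical, and it only validates surrogate pairing, which is the one way a sequence of 16-bit units can fail to be well-formed UTF-16.

```haskell
import Data.Word (Word16)

-- High (leading) surrogates occupy 0xD800..0xDBFF,
-- low (trailing) surrogates occupy 0xDC00..0xDFFF.
isHigh, isLow :: Word16 -> Bool
isHigh w = w >= 0xD800 && w <= 0xDBFF
isLow  w = w >= 0xDC00 && w <= 0xDFFF

-- | True when every high surrogate is immediately followed by a low
-- surrogate and no low surrogate appears on its own.
wellFormedUtf16 :: [Word16] -> Bool
wellFormedUtf16 [] = True
wellFormedUtf16 (w : ws)
  | isHigh w = case ws of
      (l : rest) | isLow l -> wellFormedUtf16 rest
      _                    -> False
  | isLow w   = False
  | otherwise = wellFormedUtf16 ws

main :: IO ()
main = do
  print (wellFormedUtf16 [0x0066, 0xD83D, 0xDE00])  -- 'f' plus a proper pair
  print (wellFormedUtf16 [0x0066, 0xD800])          -- lone high surrogate
```

A filename made of arbitrary 16-bit units can fail this check, which is why decoding it as strict UTF-16 is not a total operation.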
2022-06-22 14:04:16 +0200 <hpc> wtf-8 doesn't solve the fact that linux filenames are bytes either
2022-06-22 14:04:29 +0200 <maerwald> hpc: but it never fails, right?
2022-06-22 14:04:54 +0200 <maerwald> haven't tested how it behaves in detail
2022-06-22 14:05:10 +0200nate4(~nate@98.45.169.16)
2022-06-22 14:05:11 +0200 <maerwald> the current conversion in base can fail for encodings that are not supersets of ASCII
2022-06-22 14:05:37 +0200 <Maxdamantus> hpc: WTF-8 is just there to solve the Windows problem. There is no problem with translating paths to byte strings on Linux.
2022-06-22 14:05:37 +0200 <maerwald> such as some korean encodings afair
2022-06-22 14:05:57 +0200 <maerwald> Maxdamantus: there is, because roundtripping isn't always defined
2022-06-22 14:06:25 +0200 <maerwald> see https://hackage.haskell.org/package/base-4.16.1.0/docs/GHC-IO-Encoding.html#v:mkTextEncoding
2022-06-22 14:07:05 +0200 <maerwald> the other issue is that most APIs assume that the filepaths you're consuming correspond to the current locale... which is... uhm, dumb
2022-06-22 14:07:32 +0200 <Maxdamantus> maerwald: by "path" I mean as used by the OS, not as used by current Haskell.
2022-06-22 14:07:57 +0200 <maerwald> Maxdamantus: I don't understand that statement then
2022-06-22 14:08:00 +0200 <Maxdamantus> Haskell's APIs inaccurately represent paths as `String`.
2022-06-22 14:09:05 +0200 <maerwald> system APIs don't interpret encoding... things like path separators '/' are defined accurately (byte in the ascii set) and can be scanned for regardless of the actual filename encoding
2022-06-22 14:09:19 +0200 <Maxdamantus> maerwald: paths in Linux are already just sequences of bytes, so if a programming language defined a string as a sequence of bytes, there's a no-op mapping between paths and strings.
2022-06-22 14:10:01 +0200nate4(~nate@98.45.169.16) (Ping timeout: 248 seconds)
2022-06-22 14:10:13 +0200 <maerwald> Maxdamantus: yes
2022-06-22 14:10:19 +0200 <maerwald> that's what the new API does
2022-06-22 14:10:42 +0200 <maerwald> there's literally no encoding/decoding
2022-06-22 14:10:49 +0200 <maerwald> unless you want to get a Haskell String
2022-06-22 14:12:39 +0200 <Maxdamantus> maerwald: but it's complicated because it introduces a new string type, which only exists because of a Windows API that might be being obsoleted.
2022-06-22 14:12:53 +0200 <maerwald> Maxdamantus: no, it uses an existing string type
2022-06-22 14:13:01 +0200 <maerwald> ShortByteString
2022-06-22 14:13:02 +0200 <Maxdamantus> Which one?
2022-06-22 14:13:09 +0200 <Maxdamantus> Hm.
2022-06-22 14:14:02 +0200 <Maxdamantus> So do you get different `ShortByteString` values depending on whether the path was read on Windows/Linux?
2022-06-22 14:14:25 +0200 <maerwald> yes, on windows you will have UCS-2LE bytestrings that contain \NUL bytes
2022-06-22 14:14:27 +0200 <Maxdamantus> eg, for a filename that looks like "àéíóú"?
2022-06-22 14:15:20 +0200 <Maxdamantus> and what, the `Show` instance converts differently depending on the OS?
2022-06-22 14:15:41 +0200 <maerwald> Maxdamantus: yes... there's some tradeoff for the Show instance, because we have to convert to String
2022-06-22 14:16:02 +0200 <maerwald> you can't define a total function that doesn't lose information and converts to String
2022-06-22 14:16:18 +0200dsrt^(~dsrt@50.237.44.186) (Ping timeout: 264 seconds)
2022-06-22 14:17:36 +0200 <Maxdamantus> Sure, so there should be a de facto string type capable of handling practically all Unicode strings (except UTF-32 ones)
2022-06-22 14:18:17 +0200 <Maxdamantus> that type would be equivalent to `ByteString`, where WTF-8 is used for possibly ill-formed UTF-16.
2022-06-22 14:19:00 +0200 <maerwald> the cool thing with this approach compared to WTF-8 is that you could easily use something like this https://hackage.haskell.org/package/charsetdetect-ae-1.1.0.4/docs/Codec-Text-Detect.html on the raw bytes
2022-06-22 14:19:05 +0200 <maerwald> because we're not changing anything
2022-06-22 14:19:44 +0200 <maerwald> Maxdamantus: I'm open to suggestions on how to handle the Show instances
2022-06-22 14:20:21 +0200 <Maxdamantus> I'm not sure you can use that on a `ShortByteString`
2022-06-22 14:20:39 +0200 <maerwald> https://hackage.haskell.org/package/filepath-2.0.0.3/candidate/docs/src/System.OsString.Internal.T…
2022-06-22 14:20:42 +0200 <Maxdamantus> Unless it has surrogate code units in it, it could always be UTF-16.
2022-06-22 14:21:41 +0200 <Maxdamantus> ie, if you read a filename into a `ShortByteString`, how does a detector know it's not UTF-16?
2022-06-22 14:22:11 +0200 <maerwald> one way is to just convert Word8 to Char, but then you get garbled crap for most things... and the Show instance isn't really for serialization
2022-06-22 14:22:38 +0200 <maerwald> Maxdamantus: see here https://hackage.haskell.org/package/filepath-2.0.0.3/candidate/docs/System-AbstractFilePath.html#g:3
2022-06-22 14:22:44 +0200 <maerwald> there are 3 functions for conversion
2022-06-22 14:23:14 +0200 <maerwald> one that assumes UTF-8/UTF-16, one that allows to specify the encoding and one that looks up the filesystem encoding... all of them can fail
2022-06-22 14:23:28 +0200waleee(~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340)
2022-06-22 14:24:34 +0200jao(~jao@cpc103048-sgyl39-2-0-cust502.18-2.cable.virginm.net)
2022-06-22 14:29:30 +0200 <Maxdamantus> Hm, so it's dependent on the OS.
2022-06-22 14:29:48 +0200 <Maxdamantus> What happens when Windows starts offering bytestring-based filenames?
2022-06-22 14:29:50 +0200 <maerwald> Maxdamantus: well, you could specify WTF-8 for both platforms
2022-06-22 14:30:14 +0200 <maerwald> toAbstractFilePath wtf8 wtf8 fp
2022-06-22 14:30:27 +0200 <maerwald> *toAbstractFilePathEnc
2022-06-22 14:30:54 +0200 <maerwald> Maxdamantus: what do you mean?
2022-06-22 14:31:20 +0200 <maerwald> filepaths on windows are already 'wchar_t*'
2022-06-22 14:31:30 +0200alp_(~alp@user/alp)
2022-06-22 14:31:42 +0200 <Maxdamantus> maerwald: if Windows in the future deprecates use of its 16-bit APIs and offers 8-bit APIs instead, where old filenames are transparently converted to WTF-8.
2022-06-22 14:31:52 +0200 <maerwald> Maxdamantus: it will not deprecate that ever
2022-06-22 14:32:01 +0200 <maerwald> windows cares about backwards compat
2022-06-22 14:32:08 +0200 <Maxdamantus> WTF-8 is backwards-compatible.
2022-06-22 14:32:16 +0200 <maerwald> I'm talking about windows
2022-06-22 14:32:20 +0200 <Maxdamantus> So am I.
2022-06-22 14:32:22 +0200 <maerwald> WTF-8 is a rust specific thing
2022-06-22 14:32:27 +0200 <maerwald> has nothing to do with windows
2022-06-22 14:32:41 +0200 <Maxdamantus> Windows could adopt it as part of a migration strategy to 8-bit filenames.
2022-06-22 14:32:45 +0200 <maerwald> windows will continue to provide wide character versions of their system API
2022-06-22 14:32:59 +0200 <maerwald> see https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew
2022-06-22 14:33:10 +0200 <maerwald> CreateFileW stands for *wide character*
2022-06-22 14:33:15 +0200 <maerwald> it will not change its semantics
2022-06-22 14:33:57 +0200alp__(~alp@user/alp) (Read error: Connection reset by peer)
2022-06-22 14:34:57 +0200 <Maxdamantus> Their current filesystems specifically support 16-bit strings, but it seems plausible that they might move away from that and just use 8-bit strings (cf. macOS). The new APIs could support old NTFS/FAT32 filenames still by transparently converting to WTF-8 (or something equivalent, but at this point there's no point in reinventing WTF-8).
2022-06-22 14:35:08 +0200 <maerwald> no, it doesn't seem plausible
2022-06-22 14:35:16 +0200[itchyjunk](~itchyjunk@user/itchyjunk/x-7353470)
2022-06-22 14:35:23 +0200 <maerwald> windows doesn't randomly break API
2022-06-22 14:35:36 +0200 <maerwald> that's why they still haven't migrated to UTF-16, but still support UCS-2
2022-06-22 14:35:44 +0200 <maerwald> after decades
2022-06-22 14:35:49 +0200 <Maxdamantus> What API is being broken?
2022-06-22 14:38:31 +0200 <maerwald> I don't understand what you're suggesting then. Wide character API works for all existing versions of windows. There's already an ANSI API that allows to configure stuff for UTF-8: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea
2022-06-22 14:38:43 +0200 <maerwald> https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
2022-06-22 14:38:52 +0200 <Maxdamantus> aiui Windows has been gradually migrating things from UCS-2 to either UTF-8 or bytes, though I'm not sure about the details.
2022-06-22 14:38:52 +0200 <maerwald> but that isn't supported across all windows versions
2022-06-22 14:38:58 +0200 <maerwald> so that is not a good default
2022-06-22 14:39:07 +0200dsrt^(~dsrt@50.237.44.186)
2022-06-22 14:39:14 +0200fweht(uid404746@id-404746.lymington.irccloud.com)
2022-06-22 14:39:45 +0200 <maerwald> And all that doesn't matter to us. What matters is what the Win32 bindings use, and they use the wide character API: https://hackage.haskell.org/package/Win32
2022-06-22 14:42:12 +0200 <Maxdamantus> Right, but what happens if Windows starts supporting bytes-based filenames? Haskell should be able to switch over to the new API in order to handle them, but it's going to be awkward to do that if doing so means changing all of the `Show` behaviour etc for Windows users.
2022-06-22 14:42:53 +0200 <Maxdamantus> eg, Windows has been adding integration for WSL. I think they're intending on running Android apps etc.
2022-06-22 14:43:37 +0200 <maerwald> Maxdamantus: I don't think Win32 package will migrate to anything else. It will stick to wide character API.
2022-06-22 14:43:58 +0200 <Maxdamantus> theoretically they could decide at some point to offer 8-bit filename APIs which are able to handle Linux filesystems without information loss, and they should also be fully capable of handling existing NTFS filesystems without information loss due to conversion to/from WTF-8.
2022-06-22 14:44:13 +0200 <Maxdamantus> maerwald: I'm not saying they're going to remove the APIs.
2022-06-22 14:44:51 +0200 <Maxdamantus> maerwald: just offer better ones that are usable in all the same cases as the current 16-bit ones, but also handle filenames from 8-bit systems, like WSL or network shares.
2022-06-22 14:45:10 +0200 <maerwald> Maxdamantus: you're going to break Haskell for old windows versions then
2022-06-22 14:46:23 +0200 <Maxdamantus> maerwald: you mean because Haskell has to pick to use either the new API (only supports Windows 13+) or the old API (supports all versions of Windows)?
2022-06-22 14:46:32 +0200 <Maxdamantus> why can't it support both?
2022-06-22 14:47:31 +0200 <maerwald> I don't understand what problem you're trying to solve. Of course it can provide bindings for both variants, but on some windows systems the UTF-8 one will *fail*.
2022-06-22 14:48:08 +0200 <Maxdamantus> It will only fail when creating filenames that are unsupported on a filesystem, but that's already a possibility on Windows.
2022-06-22 14:48:13 +0200 <Maxdamantus> eg, can't create a file called "con"
2022-06-22 14:48:33 +0200 <geekosaur> uh
2022-06-22 14:48:37 +0200 <Maxdamantus> or a file with some special characters in it, can't think of what they are off the top of my head.
2022-06-22 14:48:56 +0200 <maerwald> "As of Windows Version 1903 (May 2019 Update), you can use the ActiveCodePage property in the appxmanifest for packaged apps, or the fusion manifest for unpackaged apps, to force a process to use UTF-8 as the process code page."
2022-06-22 14:49:09 +0200 <geekosaur> so, wide vs. narrow characters are just a bit more intrusive than that
2022-06-22 14:49:24 +0200 <Maxdamantus> anyway, need to go to bed.
2022-06-22 14:49:26 +0200 <Maxdamantus> Thu Jun 23 12:49:26 AM NZST 2022
2022-06-22 14:50:18 +0200 <maerwald> 1. it requires configuration, 2. it doesn't work on all windows versions, 3. it complicates filepath handling
2022-06-22 14:50:20 +0200 <maerwald> what's the gain
2022-06-22 14:59:55 +0200dsrt^(~dsrt@50.237.44.186) (Ping timeout: 256 seconds)
2022-06-22 15:00:30 +0200gurkenglas(~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de) (Ping timeout: 276 seconds)
2022-06-22 15:01:48 +0200ridcully(~ridcully@pd951ff85.dip0.t-ipconnect.de) (Ping timeout: 276 seconds)
2022-06-22 15:03:38 +0200juri__(~juri@79.140.115.124) (Ping timeout: 240 seconds)
2022-06-22 15:03:42 +0200renzhi(~xp@2607:fa49:6500:b100::f64a) (Ping timeout: 264 seconds)
2022-06-22 15:04:08 +0200mrd(~mrd@user/mrd)
2022-06-22 15:04:10 +0200dschrempf(~dominik@070-207.dynamic.dsl.fonira.net) (Quit: WeeChat 3.5)
2022-06-22 15:09:12 +0200ChaiTRex(~ChaiTRex@user/chaitrex) (Remote host closed the connection)
2022-06-22 15:10:54 +0200kuribas(~user@ip-188-118-57-242.reverse.destiny.be) (Ping timeout: 264 seconds)
2022-06-22 15:12:16 +0200pleo(~pleo@user/pleo)
2022-06-22 15:14:50 +0200odnes(~odnes@5-203-220-108.pat.nym.cosmote.net) (Remote host closed the connection)
2022-06-22 15:15:09 +0200chexum(~quassel@gateway/tor-sasl/chexum) (Ping timeout: 268 seconds)
2022-06-22 15:15:12 +0200odnes(~odnes@5-203-220-108.pat.nym.cosmote.net)
2022-06-22 15:15:16 +0200ChaiTRex(~ChaiTRex@user/chaitrex)
2022-06-22 15:17:30 +0200chexum(~quassel@gateway/tor-sasl/chexum)
2022-06-22 15:18:26 +0200zebrag(~chris@user/zebrag)
2022-06-22 15:19:06 +0200shriekingnoise(~shrieking@201.212.175.181)
2022-06-22 15:26:34 +0200Unicorn_Princess(~Unicorn_P@93-103-228-248.dynamic.t-2.net)
2022-06-22 15:28:05 +0200toluene(~toluene@user/toulene)
2022-06-22 15:28:12 +0200Infinite(~Infinite@49.39.125.113)
2022-06-22 15:29:48 +0200dsrt^(~dsrt@50.237.44.186)
2022-06-22 15:30:33 +0200juri_(~juri@79.140.115.124)
2022-06-22 15:34:21 +0200crazazy(~user@130.89.171.62)
2022-06-22 15:36:06 +0200vysn(~vysn@user/vysn) (Ping timeout: 264 seconds)
2022-06-22 15:37:29 +0200cfricke(~cfricke@user/cfricke) (Ping timeout: 248 seconds)
2022-06-22 15:37:29 +0200k`(~user@152.1.137.158)
2022-06-22 15:39:23 +0200 <k`> How do I depend on a github repo in my cabal file? Currently I've written a 'source-repository' stanza for it, but cabal fails with 'unknown package'.
2022-06-22 15:39:35 +0200 <merijn> k`: You don't
2022-06-22 15:39:47 +0200 <merijn> k`: You probably want a cabal.project file
2022-06-22 15:40:16 +0200 <k`> merijn: Thanks, I'll open up the docs on that.
2022-06-22 15:41:15 +0200 <k`> Do I use one cabal.project file for the entire project, or do I put one in each package subdir?
2022-06-22 15:42:25 +0200 <sm> one for the project
2022-06-22 15:42:52 +0200dsrt^(~dsrt@50.237.44.186) (Remote host closed the connection)
2022-06-22 15:42:59 +0200 <k`> sm: Thanks. That's what the name seems to imply but I didn't want to make any foolish assumptions.
2022-06-22 15:43:04 +0200 <merijn> k`: So, the distinction is: a .cabal is a standalone description of a specific package (dependencies, flags, etc.) "cabal.project" is for defining the context in which a project (of one or more packages) is being used/built and allows you to override things (like saying to use a local directory or git repo to develop against unreleased code)
2022-06-22 15:43:50 +0200eod|fserucas_(~eod|fseru@193.65.114.89.rev.vodafone.pt) (Quit: Leaving)
2022-06-22 15:44:06 +0200 <k`> So just out of curiosity, how would the individual packages be built when they don't have access to the overall project description?
2022-06-22 15:45:04 +0200 <sm> there doesn't seem to be an introduction to cabal.project in the user guide
2022-06-22 15:45:25 +0200 <merijn> k`: The idea is that individual packages (when you release them) only depend on other released packages/versions, not git repos
2022-06-22 15:45:31 +0200 <merijn> sm: There was a WIP to write one
2022-06-22 15:46:07 +0200 <jackdk> the reference is at least thorough, but I'm not aware of any good intros: https://cabal.readthedocs.io/en/3.6/cabal-project.html
2022-06-22 15:46:13 +0200 <sclv> https://cabal.readthedocs.io/en/3.6/cabal-project.html and https://cabal.readthedocs.io/en/3.6/nix-local-build.html#developing-multiple-packages
2022-06-22 15:46:24 +0200 <sclv> the latter of the two i posted is sort of an intro
2022-06-22 15:46:46 +0200 <k`> So, say I have packages 'foo-class', 'foo-pattern', and 'foo-type', with 'foo-pattern' and 'foo-type' both depending on 'foo-class'. Where do I give the repo for 'foo-class' in 'foo-type' so that 'foo-type' can be built independently?
2022-06-22 15:47:10 +0200 <sclv> the packages all are like normal packages. the project file ties them all together
2022-06-22 15:47:20 +0200 <merijn> k`: The idea would be that, eventually foo-class gets released on hackage
2022-06-22 15:47:38 +0200 <merijn> k`: Basically "where to find a package" is NOT something .cabal files are concerned with
2022-06-22 15:47:45 +0200 <merijn> They merely state "what package"
2022-06-22 15:47:49 +0200 <merijn> (and version)
2022-06-22 15:48:16 +0200 <merijn> k`: The implicit context is that "where" is "the package repository (aka Hackage instance) that you happen to point cabal-install at"
2022-06-22 15:48:26 +0200 <k`> merijn: Fair enough. I'm just always hesitant to release anything to Hackage because my code quality is shit.
2022-06-22 15:48:47 +0200 <k`> But I still want to make things properly modular.
2022-06-22 15:48:53 +0200 <sclv> a hackage release is a package tarball. those don't include cabal.project files
2022-06-22 15:48:56 +0200 <merijn> k`: You can run your own hackage and cabal-install can be pointed at a different (and even multiple!) hackages :)
2022-06-22 15:49:07 +0200 <sclv> cabal.project files are _only_ for use in developing a collection of packages from a repo
2022-06-22 15:49:15 +0200 <merijn> k`: So it's perfectly possible to have a personal/company/whatever Hackage repo
2022-06-22 15:49:28 +0200 <merijn> sclv: Not *only* for that
2022-06-22 15:49:29 +0200 <sclv> once you upload to hackage, you should ensure all the deps are already on hackage
2022-06-22 15:50:13 +0200coot(~coot@213.134.190.95)
2022-06-22 15:50:22 +0200motherfsck(~motherfsc@user/motherfsck) (Quit: quit)
2022-06-22 15:51:05 +0200 <sm> ah there it is, https://cabal.readthedocs.io/en/3.6/nix-local-build.html#developing-multiple-packages . The cursed "nix-style" jargon strikes again
2022-06-22 15:51:50 +0200cfricke(~cfricke@user/cfricke)
2022-06-22 15:52:07 +0200motherfsck(~motherfsc@user/motherfsck)
2022-06-22 15:52:10 +0200 <k`> Anyone know what happens when I list a source-repository-package that points to a subdir of a project that uses cabal.project to build? Just fails to build?
2022-06-22 15:52:42 +0200 <sm> (I did search the site for "cabal.project", must have missed the Quickstart)
2022-06-22 15:52:42 +0200 <merijn> k`: What do you mean?
2022-06-22 15:53:24 +0200 <merijn> k`: If you have a cabal.project in a directory for project X, then X and all its dependencies should be findable via that 1 cabal.project
2022-06-22 15:53:57 +0200 <merijn> k`: if you meant "X depends on Y and Y uses cabal.project to find Z", then cabal.project for X needs to include repo pointers for both Y and Z
2022-06-22 15:54:35 +0200 <k`> merijn: Oh, that is good to know.
2022-06-22 15:54:43 +0200 <k`> Would not have expected that.
2022-06-22 15:56:04 +0200 <k`> So if I'm trying to modularize with multiple packages I should throw them all in one huge package repo so they can all find their dependencies, and I don't need to update the cabal.project of Z when a new transitive dependency is added.
2022-06-22 15:56:35 +0200 <merijn> k`: If they're interdependent then I would say yes
2022-06-22 15:56:48 +0200 <sclv> that's a common pattern, yes
2022-06-22 15:56:51 +0200 <merijn> k`: See for example: https://github.com/merijn/broadcast-chan
2022-06-22 15:57:38 +0200 <merijn> k`: Although once you get past, say, 10 packages in the same repo I would start questioning what I'm doing if I were you ;)
2022-06-22 15:57:42 +0200gmg(~user@user/gehmehgeh) (Ping timeout: 268 seconds)
2022-06-22 15:58:56 +0200 <k`> I see that there you give the direct paths to the subdirectories. Is it standard to do that rather than round tripping through github?
2022-06-22 15:59:33 +0200 <merijn> k`: Yes, pointing at github will use whatever is currently on github *NOT* what you have in your local clone
2022-06-22 15:59:34 +0200vglfr(~vglfr@coupling.penchant.volia.net) (Read error: Connection reset by peer)
2022-06-22 15:59:41 +0200gmg(~user@user/gehmehgeh)
2022-06-22 15:59:45 +0200vglfr(~vglfr@coupling.penchant.volia.net)
2022-06-22 15:59:50 +0200 <merijn> k`: Whereas the subdirectories tell it to use whatever is in the local subdirectories *right now*
2022-06-22 15:59:56 +0200 <merijn> Which is probably what you want
2022-06-22 16:00:35 +0200 <merijn> (because if you're locally changing foo-class, you probably want local versions of foo-instance to pick that up :p)
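A minimal sketch of the layout described above, assuming a single repository that holds k`'s three packages in subdirectories named after them. The directory names, the external repository URL, the commit hash and the `bar-core` subdirectory are all hypothetical placeholders, not taken from any real project.

```cabal
-- cabal.project at the repository root (hypothetical layout).
-- Local subdirectories: cabal builds whatever is on disk right now,
-- so a local change to foo-class is picked up by foo-pattern and foo-type.
packages:
  foo-class/
  foo-pattern/
  foo-type/

-- A dependency that lives in another repository and is not on Hackage.
-- This uses the commit named here, not a local clone.
-- location, tag and subdir are placeholders.
source-repository-package
  type: git
  location: https://example.com/someone/other-repo.git
  tag: 0123456789abcdef0123456789abcdef01234567
  subdir: bar-core
```

The individual .cabal files stay ordinary: they only name `foo-class` (and a version range) in `build-depends`, and the project file is what says where to find it.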
2022-06-22 16:01:40 +0200Guest59(~Guest59@148.253.134.213)
2022-06-22 16:02:10 +0200Guest59(~Guest59@148.253.134.213) (Client Quit)
2022-06-22 16:02:20 +0200Infinite(~Infinite@49.39.125.113) (Quit: Client closed)
2022-06-22 16:02:41 +0200Infinite(~Infinite@49.39.125.113)
2022-06-22 16:02:55 +0200mecharyuujin(~mecharyuu@2409:4050:ece:7592:439d:86ae:5a53:fec7)
2022-06-22 16:03:00 +0200pleo(~pleo@user/pleo) (Quit: quit)
2022-06-22 16:04:44 +0200 <mecharyuujin> Heya, beginner here, am learning Haskell using the Learn You a Haskell tutorial
2022-06-22 16:05:04 +0200 <mecharyuujin> head'' :: [a] -> a
2022-06-22 16:05:11 +0200 <mecharyuujin> head'' = foldr1 (\x _ -> x)
2022-06-22 16:05:26 +0200 <mecharyuujin> why does this version of head work on infinite lists?
2022-06-22 16:06:57 +0200 <mecharyuujin> I thought foldr1 would need the last element of the list as the starting value, and even though it is useless in this case, how would GHC know that it's useless here? Is GHC able to figure it out?
2022-06-22 16:09:08 +0200 <k`> merijn, sclv, sm, thank you so much. I think I know what to do now.
2022-06-22 16:09:35 +0200 <merijn> mecharyuujin: Consider this: Can you rewrite the application of head'' by replacing "foldr1" with the definition of foldr1?
2022-06-22 16:09:47 +0200 <merijn> i.e. take foldr1, turn it into a lambda, insert in the code for head''
2022-06-22 16:09:57 +0200albet70(~xxx@2400:8902::f03c:92ff:fe60:98d8) (Remote host closed the connection)
2022-06-22 16:10:10 +0200 <k`> Maybe if I ever get better at programming some of this will make it to Hackage. But considering how little I've improved in the last 14 years of using Haskell, it seems unlikely :-)
2022-06-22 16:12:06 +0200Vajb(~Vajb@2001:999:40:4c50:1b24:879c:6df3:1d06) (Read error: Connection reset by peer)
2022-06-22 16:12:12 +0200 <sm> it'll start to flow one of these days!
2022-06-22 16:12:51 +0200Vajb(~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi)
2022-06-22 16:15:01 +0200 <carbolymer> do you know if I can somehow put seed into hedgehog to not have random output from generators?
2022-06-22 16:15:14 +0200 <carbolymer> or into tasty
2022-06-22 16:15:53 +0200pleo(~pleo@user/pleo)
2022-06-22 16:16:04 +0200albet70(~xxx@2400:8902::f03c:92ff:fe60:98d8)
2022-06-22 16:17:41 +0200 <mecharyuujin> merijn, I am not sure how I would turn foldr1 into a lambda. If I had to implement it, I would probably do it like
2022-06-22 16:17:45 +0200jakalx(~jakalx@base.jakalx.net) (Error from remote client)
2022-06-22 16:17:47 +0200 <mecharyuujin> foldr1 f [x] = x
2022-06-22 16:17:55 +0200 <mecharyuujin> foldr1 f (x:xs) = f x (foldr1 f xs)
2022-06-22 16:20:05 +0200jakalx(~jakalx@base.jakalx.net)
2022-06-22 16:21:50 +0200 <merijn> ok, so let's fill in the lambda from head'' in that code
2022-06-22 16:22:07 +0200 <merijn> Clearly in the first case it will return the first item, yeah?
2022-06-22 16:22:17 +0200 <merijn> So, let's look at the 2nd case
2022-06-22 16:22:28 +0200 <mecharyuujin> yeah
2022-06-22 16:22:30 +0200 <merijn> foldr1 f (x:xs) = f x (foldr1 f xs)
2022-06-22 16:23:10 +0200 <merijn> Let's rename in head'' to get: head'' = foldr1 (\y _ -> y)
2022-06-22 16:23:15 +0200 <merijn> (remove some name confusion)
2022-06-22 16:24:06 +0200 <merijn> Actually, let's eta expand too: head'' (x:xs) = foldr1 (\y _ -> y) (x:xs)
2022-06-22 16:24:46 +0200 <merijn> Expand foldr1 using its definition and we get: (\y _ -> y) x (foldr1 (\y _ -> y) xs)
2022-06-22 16:25:03 +0200 <mecharyuujin> Ah, I see
2022-06-22 16:25:05 +0200 <merijn> Which will obviously return 'x' (so the head of the list)
2022-06-22 16:25:06 +0200 <mecharyuujin> this is simply x
2022-06-22 16:25:15 +0200 <mecharyuujin> Thanks a ton, merijn!
2022-06-22 16:25:29 +0200 <merijn> And laziness means we only evaluate the 2nd argument (the recursive foldr1 call) when needed (i.e. never)
2022-06-22 16:25:38 +0200 <mecharyuujin> yeah
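The expansion above can be tried directly. This is a small sketch using a hand-written `foldr1'` (the primed name is only there to avoid clashing with the Prelude version) that follows the two equations from the discussion, plus an extra clause for the empty list.

```haskell
module Main where

-- Hand-written foldr1, following the two equations discussed above.
-- Matching the pattern [x] forces only one more cons cell of the list,
-- so it is safe on infinite input.
foldr1' :: (a -> a -> a) -> [a] -> a
foldr1' _ [x]    = x
foldr1' f (x:xs) = f x (foldr1' f xs)
foldr1' _ []     = error "foldr1': empty list"

-- The lambda ignores its second argument, so the recursive call
-- (foldr1' f xs) stays an unevaluated thunk.
head'' :: [a] -> a
head'' = foldr1' (\y _ -> y)

main :: IO ()
main = do
  print (head'' [1 ..])   -- infinite list; prints 1
  print (head'' "abc")    -- prints 'a'
```

Nothing here relies on GHC spotting that the second argument is useless: lazy evaluation simply never demands it.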
2022-06-22 16:30:10 +0200eggplantade(~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net)
2022-06-22 16:32:44 +0200waleee(~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340) (Ping timeout: 255 seconds)
2022-06-22 16:33:10 +0200Infinite(~Infinite@49.39.125.113) (Ping timeout: 252 seconds)
2022-06-22 16:33:13 +0200fnurglewitz(uid263868@id-263868.lymington.irccloud.com) (Quit: Connection closed for inactivity)
2022-06-22 16:33:42 +0200HotblackDesiato(~HotblackD@gateway/tor-sasl/hotblackdesiato) (Remote host closed the connection)
2022-06-22 16:33:58 +0200HotblackDesiato(~HotblackD@gateway/tor-sasl/hotblackdesiato)
2022-06-22 16:34:41 +0200eggplantade(~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net) (Ping timeout: 268 seconds)
2022-06-22 16:35:43 +0200Sgeo(~Sgeo@user/sgeo)
2022-06-22 16:35:45 +0200 <mecharyuujin> How is foldl/foldl1 implemented? Using (init xs) and (last xs) doesn't seem particularly efficient...
2022-06-22 16:39:00 +0200ccntrq1(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 16:39:03 +0200 <k`> mecharyuujin: foldl folds from right to left, starting with the accumulator value.
2022-06-22 16:39:30 +0200 <k`> Think of it as a loop onto an accumulator rather than a fold like foldr.
2022-06-22 16:39:42 +0200 <k`> Sorry, left to right.
2022-06-22 16:39:50 +0200 <k`> Just like foldr.
2022-06-22 16:41:13 +0200Timely_Ratio9567(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2)
2022-06-22 16:41:58 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 240 seconds)
2022-06-22 16:42:01 +0200Timely_Ratio9567(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Client Quit)
2022-06-22 16:42:12 +0200Timely_Ratio9567(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2)
2022-06-22 16:43:16 +0200Timely_Ratio9567(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Client Quit)
2022-06-22 16:43:23 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com)
2022-06-22 16:43:28 +0200Timely_Ratio9567(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2)
2022-06-22 16:43:29 +0200Timely_Ratio9567(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Remote host closed the connection)
2022-06-22 16:43:32 +0200mecharyuujin(~mecharyuu@2409:4050:ece:7592:439d:86ae:5a53:fec7) (Ping timeout: 255 seconds)
2022-06-22 16:43:37 +0200ccntrq1(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 248 seconds)
2022-06-22 16:45:18 +0200mecharyuujin(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2)
2022-06-22 16:45:50 +0200_73(~user@pool-108-49-252-36.bstnma.fios.verizon.net) (Remote host closed the connection)
2022-06-22 16:48:31 +0200Timely_Ratio9567(~mecharyuu@2409:4050:2d4b:a853:8048:c716:f88e:d09f)
2022-06-22 16:51:05 +0200mecharyuujin(~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Ping timeout: 248 seconds)
2022-06-22 16:51:42 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 16:59:36 +0200Timely_Ratio9567(~mecharyuu@2409:4050:2d4b:a853:8048:c716:f88e:d09f) (Quit: Leaving)
2022-06-22 17:03:50 +0200lortabac(~lortabac@2a01:e0a:541:b8f0:2cd:7ecf:235f:1481) (Quit: WeeChat 2.8)
2022-06-22 17:10:13 +0200 <tomsmeding> @src foldl
2022-06-22 17:10:13 +0200 <lambdabot> foldl f z [] = z
2022-06-22 17:10:13 +0200 <lambdabot> foldl f z (x:xs) = foldl f (f z x) xs
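A short sketch of why neither `init` nor `last` is needed: `foldl` walks the list from the front while carrying an accumulator, and `foldl1` just uses the first element as the initial accumulator (as in the Haskell Report definition). The names `myFoldl` and `myFoldl1` are made up here to avoid clashing with the Prelude.

```haskell
module Main where

-- Same shape as the lambdabot output: consume the list front to back,
-- updating the accumulator z at each step.
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _ z []     = z
myFoldl f z (x:xs) = myFoldl f (f z x) xs

-- foldl1 takes the first element as the starting accumulator.
myFoldl1 :: (a -> a -> a) -> [a] -> a
myFoldl1 f (x:xs) = myFoldl f x xs
myFoldl1 _ []     = error "myFoldl1: empty list"

main :: IO ()
main = do
  -- ((10 - 1) - 2) - 3
  print (myFoldl (-) 10 [1, 2, 3])   -- prints 4
  -- (1 - 2) - 3
  print (myFoldl1 (-) [1, 2, 3])     -- prints -4
```

Because the recursive call is the outermost expression, this behaves like a loop, but it has to reach the end of the list before returning anything, so unlike the `foldr1` example it cannot stop early on an infinite list.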
2022-06-22 17:13:44 +0200stackdroid18(14094@user/stackdroid)
2022-06-22 17:24:59 +0200Unicorn_Princess(~Unicorn_P@93-103-228-248.dynamic.t-2.net) (Remote host closed the connection)
2022-06-22 17:28:05 +0200fweht(uid404746@id-404746.lymington.irccloud.com) (Quit: Connection closed for inactivity)
2022-06-22 17:29:31 +0200ccntrq(~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection)
2022-06-22 17:31:37 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection)
2022-06-22 17:33:42 +0200chele(~chele@user/chele) (Remote host closed the connection)
2022-06-22 17:33:54 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 17:36:02 +0200mc47(~mc47@xmonad/TheMC47) (Remote host closed the connection)
2022-06-22 17:37:32 +0200Vajb(~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) (Read error: Connection reset by peer)
2022-06-22 17:37:43 +0200Vajb(~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi)
2022-06-22 17:38:10 +0200Vajb(~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) (Read error: Connection reset by peer)
2022-06-22 17:38:50 +0200Vajb(~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi)
2022-06-22 17:44:19 +0200mbuf(~Shakthi@122.164.15.152) (Quit: Leaving)
2022-06-22 17:45:22 +0200_xor(~xor@74.215.182.83)
2022-06-22 17:45:41 +0200Surobaki(~surobaki@137.44.222.80) (Read error: Connection reset by peer)
2022-06-22 17:47:31 +0200haritz(~hrtz@user/haritz) (Remote host closed the connection)
2022-06-22 17:49:50 +0200ridcully(~ridcully@pd951f3bf.dip0.t-ipconnect.de)
2022-06-22 17:51:37 +0200MajorBiscuit(~MajorBisc@wlan-145-94-167-213.wlan.tudelft.nl) (Ping timeout: 256 seconds)
2022-06-22 17:55:12 +0200werneta(~werneta@70-142-214-115.lightspeed.irvnca.sbcglobal.net) (Ping timeout: 260 seconds)
2022-06-22 17:57:06 +0200merijn(~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl) (Ping timeout: 264 seconds)
2022-06-22 18:00:10 +0200shlevy[m](~shlevymat@2001:470:69fc:105::1:d3b1) (Quit: You have been kicked for being idle)
2022-06-22 18:03:53 +0200 <sm> I give up. How do you get the current system locale ?
2022-06-22 18:04:05 +0200 <sm> or time locale ?
2022-06-22 18:05:25 +0200lagash(lagash@lagash.shelltalk.net)
2022-06-22 18:05:40 +0200 <geekosaur> afaik you have to use the old-locale package to get the time locale. not sure about system locale unless it's buried in GHC.IO somewhere
2022-06-22 18:06:41 +0200nate4(~nate@98.45.169.16)
2022-06-22 18:06:42 +0200cfricke(~cfricke@user/cfricke) (Ping timeout: 264 seconds)
2022-06-22 18:06:58 +0200 <sm> that provides https://hackage.haskell.org/package/time-1.13/docs/Data-Time-Format.html#v:defaultTimeLocale , "Locale representing American usage." I'm not sure what that means now
2022-06-22 18:08:07 +0200wagle(~wagle@quassel.wagle.io) (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
2022-06-22 18:08:14 +0200 <sm> even though I'm using it plenty
2022-06-22 18:08:23 +0200pavonia(~user@user/siracusa) (Quit: Bye!)
2022-06-22 18:08:37 +0200wagle(~wagle@quassel.wagle.io)
2022-06-22 18:09:03 +0200 <sm> alright, yes that's a constant. I want what's currently set eg with LC_TIME
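One low-tech way to find out what is "currently set with LC_TIME" is to read the environment directly. The sketch below is an assumption about what would be good enough here: it only returns the locale *name* (for example `es_ES.UTF-8`), following the usual POSIX precedence of `LC_ALL`, then `LC_TIME`, then `LANG`, and it does not consult libc or produce a `TimeLocale`. The function names are made up for illustration.

```haskell
module Main where

import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- Pure part: pick the effective name for the time category.
-- Empty strings are treated as unset; "C" is the assumed fallback.
pickTimeLocale :: Maybe String -> Maybe String -> Maybe String -> String
pickTimeLocale lcAll lcTime lang =
  fromMaybe "C" (present lcAll <|> present lcTime <|> present lang)
  where
    present m = m >>= \s -> if null s then Nothing else Just s

-- IO part: read the three variables from the process environment.
timeLocaleName :: IO String
timeLocaleName =
  pickTimeLocale <$> lookupEnv "LC_ALL"
                 <*> lookupEnv "LC_TIME"
                 <*> lookupEnv "LANG"

main :: IO ()
main = timeLocaleName >>= putStrLn
```

Turning that name into localised month and day names still needs either a libc binding or a table maintained by hand.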
2022-06-22 18:09:15 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection)
2022-06-22 18:09:17 +0200terrorjack(~terrorjac@2a01:4f8:1c1e:509a::1) (Quit: The Lounge - https://thelounge.chat)
2022-06-22 18:10:25 +0200odnes(~odnes@5-203-220-108.pat.nym.cosmote.net) (Remote host closed the connection)
2022-06-22 18:10:47 +0200odnes(~odnes@5-203-220-108.pat.nym.cosmote.net)
2022-06-22 18:11:30 +0200nate4(~nate@98.45.169.16) (Ping timeout: 264 seconds)
2022-06-22 18:13:32 +0200 <sm> https://stackoverflow.com/questions/28077322/getting-the-date-format-for-the-current-locale recommends current-locale (from 2015). I had tried this, but its TimeLocale is incompatible.. so not very useful. Strange..
2022-06-22 18:14:26 +0200 <tomsmeding> sm: the C way would be to call setlocale(LC_ALL, NULL), I guess you could bind that manually
2022-06-22 18:14:53 +0200 <tomsmeding> the ghc repo (hence base) doesn't contain any relevant calls to setlocale, and neither 'time' nor 'old-locale' have any hits when searching for setlocale in the git repo
2022-06-22 18:15:14 +0200 <tomsmeding> side note, "The setlocale() function is used to set or query the program's current locale." illustrates the great naming of that function
2022-06-22 18:15:19 +0200 <sm> does it mean that basically no haskell programs are aware of system time locale, eg for parsing/printing localised month names ?
2022-06-22 18:15:48 +0200 <geekosaur> xmonad binds setlocale but only to force it to locale "C"
2022-06-22 18:20:30 +0200pleo(~pleo@user/pleo) (Ping timeout: 264 seconds)
2022-06-22 18:21:12 +0200AndrewGNU\Andrew
2022-06-22 18:21:26 +0200 <tomsmeding> also found this interesting library: https://hackage.haskell.org/package/env-locale-1.0.0.1/docs/src/System-Locale-Current.html#current…
2022-06-22 18:21:49 +0200 <tomsmeding> the funny thing being, that 'prepare_locale' binds to the function at the top here https://hackage.haskell.org/package/env-locale-1.0.0.1/src/cbits/glue.c
2022-06-22 18:22:13 +0200 <tomsmeding> oh wait I misread the manpage, disregard
2022-06-22 18:22:42 +0200 <tomsmeding> sm: did you check that library already, or is the TimeLocale of that thing incompatible too?
2022-06-22 18:23:26 +0200merijn(~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl)
2022-06-22 18:24:39 +0200 <tomsmeding> the returned knownTimeZones is bogus though
2022-06-22 18:26:20 +0200 <sm> tomsmeding: no I hadn't seen that one
2022-06-22 18:26:37 +0200HotblackDesiato(~HotblackD@gateway/tor-sasl/hotblackdesiato) (Remote host closed the connection)
2022-06-22 18:27:26 +0200 <sm> looking closer, current-locale's is I guess the TimeLocale defined by old-locale, but Data.Time.Format expects the one defined by time. So I guess current-locale needs an update to use that
2022-06-22 18:27:33 +0200HotblackDesiato(~HotblackD@gateway/tor-sasl/hotblackdesiato)
2022-06-22 18:28:16 +0200 <sm> env-locale's looks like the right one
2022-06-22 18:28:29 +0200 <tomsmeding> Found using hackage search for "locale" :)
2022-06-22 18:30:05 +0200yauhsien_(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 18:30:05 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Read error: Connection reset by peer)
2022-06-22 18:30:45 +0200 <sm> thanks tomsmeding
2022-06-22 18:34:52 +0200jespada(~jespada@cpc121022-nmal24-2-0-cust171.19-2.cable.virginm.net) (Ping timeout: 260 seconds)
2022-06-22 18:35:09 +0200dlbh^(~dlbh@50.237.44.186)
2022-06-22 18:36:05 +0200Feuermagier_(~Feuermagi@138.199.36.237) (Quit: Leaving)
2022-06-22 18:36:16 +0200Feuermagier(~Feuermagi@user/feuermagier)
2022-06-22 18:36:34 +0200sjanssen(~sjanssenm@2001:470:69fc:105::1:61d8)
2022-06-22 18:37:17 +0200 <tomsmeding> sm: I guess part of the problem is that e.g. I have my system set to en_US.UTF8 despite there being an ocean between us
2022-06-22 18:37:21 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 18:37:50 +0200 <tomsmeding> Gimme the original text please, don't go translating my compiler errors
2022-06-22 18:38:23 +0200 <tomsmeding> I've seen gcc errors getting translated on another person's machine and boy is that awkward
2022-06-22 18:38:58 +0200cheater1__(~Username@user/cheater)
2022-06-22 18:39:00 +0200 <tomsmeding> Even apart from the fact that some errors aren't translated, and the flags aren't, etc
2022-06-22 18:39:05 +0200jespada(~jespada@cpc121022-nmal24-2-0-cust171.19-2.cable.virginm.net)
2022-06-22 18:39:06 +0200cheater(~Username@user/cheater) (Ping timeout: 264 seconds)
2022-06-22 18:39:11 +0200cheater1__cheater
2022-06-22 18:39:33 +0200 <sm> my context: someone wants hledger to parse their CSV dates correctly with %b recognising "abr" as april
2022-06-22 18:39:58 +0200 <tomsmeding> Okay that makes a lot of sense, but I wouldn't want that to be dependent on the system locale
2022-06-22 18:40:26 +0200 <sm> using the system locale would be the best default, no ?
2022-06-22 18:40:31 +0200 <tomsmeding> Then you get excel-like shenanigans where some systems want SUM(a, b) and others SUM(a; b), never mind SOM(a; b) in NL
2022-06-22 18:40:58 +0200 <tomsmeding> I'd want en_US to be the default for consistency and reproducibility
2022-06-22 18:41:11 +0200 <tomsmeding> But then as said I have my system locale set to that anyway :p
2022-06-22 18:41:49 +0200 <tomsmeding> The thing being that if it recognises abr as April, then it doesn't recognise apr anymore (presumably)
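A sketch of the trade-off being discussed, using the `time` package's `TimeLocale` record. The Spanish month table and the date format `"%d %b %Y"` are assumptions chosen for illustration; only the `months` field is overridden and everything else is inherited from `defaultTimeLocale`.

```haskell
module Main where

import Data.Time.Calendar (Day)
import Data.Time.Format (TimeLocale (..), defaultTimeLocale, parseTimeM)

-- Hypothetical Spanish locale: (full name, abbreviation) per month.
spanishTimeLocale :: TimeLocale
spanishTimeLocale = defaultTimeLocale
  { months =
      [ ("enero", "ene"), ("febrero", "feb"), ("marzo", "mar")
      , ("abril", "abr"), ("mayo", "may"), ("junio", "jun")
      , ("julio", "jul"), ("agosto", "ago"), ("septiembre", "sep")
      , ("octubre", "oct"), ("noviembre", "nov"), ("diciembre", "dic")
      ]
  }

-- Parse a date such as "22 abr 2022" with the given locale.
parseDay :: TimeLocale -> String -> Maybe Day
parseDay locale = parseTimeM True locale "%d %b %Y"

main :: IO ()
main = do
  print (parseDay spanishTimeLocale "22 abr 2022")  -- %b sees "abr" as April
  print (parseDay spanishTimeLocale "22 apr 2022")  -- "apr" is not in this table
  print (parseDay defaultTimeLocale "22 apr 2022")  -- default locale knows "Apr"
```

This is the effect tomsmeding describes: once the locale's month table is swapped, the English abbreviations stop parsing, which is an argument for making the locale an explicit setting rather than an implicit one.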
2022-06-22 18:41:57 +0200 <sm> well, I hear that. I was the same way about UTF8 (but lately I discovered I didn't enforce that from the start and people are reading with bizarro system encodings)
2022-06-22 18:42:13 +0200 <sm> like "latin-1"
2022-06-22 18:42:44 +0200 <k`> sm: I'm so sorry to hear that.
2022-06-22 18:43:04 +0200 <tomsmeding> Is there a "system encoding", and is that set to latin-1 in those cases?
2022-06-22 18:43:42 +0200 <sm> yes, I'm afraid there is and it is
2022-06-22 18:44:08 +0200 <tomsmeding> ._.
2022-06-22 18:44:09 +0200 <sm> tomsmeding: just to be clear, you'd favour sticking with en_US as default, but allowing user to override it at run time ?
2022-06-22 18:44:37 +0200 <tomsmeding> Yes, and same for UTF8 actually - but apparently that ship has sailed. But this is just my opinion :)
2022-06-22 18:44:46 +0200 <sm> and when I say people, I mean one guy.
2022-06-22 18:45:01 +0200 <tomsmeding> :p
2022-06-22 18:45:17 +0200 <tomsmeding> There are also people still running windows xp
2022-06-22 18:45:28 +0200 <sm> revisiting the UTF8 thing is actually the current top priority hledger issue. But I'm taking a break as I got sick of it :)
2022-06-22 18:46:46 +0200 <tomsmeding> Might even have an environment variable that instructs hledger to use a particular (or the system) locale, so that one doesn't have to set that each time, or to use a shell alias
2022-06-22 18:46:51 +0200 <tomsmeding> But yes
2022-06-22 18:47:18 +0200smguesses encoding and time locale should probably be handled the same way, whatever that is
2022-06-22 18:47:19 +0200 <k`> Hecate: Thoughts on Haskell parsing locale and then letting you write `classe Traversable (Soit c) ou traverser f = soit (pur . Gauche) (fmap Droite . f)` ?
2022-06-22 18:47:40 +0200 <tomsmeding> Understandable to get sick from locales and encodings, my burn with locales was when my (C++) code started failing to parse my save files when I added a user interface
2022-06-22 18:48:06 +0200even4void(even4void@came.here.for-some.fun) (Quit: fBNC - https://bnc4free.com)
2022-06-22 18:48:06 +0200xacktm(xacktm@user/xacktm) (Quit: fBNC - https://bnc4free.com)
2022-06-22 18:48:38 +0200 <tomsmeding> Turned out that that system had an nl_NL locale set for numeric, and my file format used floats, and the gtk library calls setlocale(LC_ALL, "") -- previously I'd unknowingly been running in the default, namely C
2022-06-22 18:49:15 +0200 <sm> lovely
2022-06-22 18:49:25 +0200leeb(~leeb@KD106155002239.au-net.ne.jp) (Ping timeout: 256 seconds)
2022-06-22 18:49:37 +0200econo(uid147250@user/econo)
2022-06-22 18:49:37 +0200andreas303(andreas303@ip227.orange.bnc4free.com) (Quit: fBNC - https://bnc4free.com)
2022-06-22 18:50:45 +0200 <tomsmeding> (we use , for decimals over here)
2022-06-22 18:51:21 +0200 <k`> I have 'current format' set to English(Sweden). Wonder what that's subtly messing up.
2022-06-22 18:51:45 +0200 <yushyin> tomsmeding: en_IE.UTF-8 is my preferred locale :)
2022-06-22 18:57:30 +0200 <tomsmeding> yushyin: why specifically IE? (Are you in Ireland?)
2022-06-22 18:58:05 +0200vysn(~vysn@user/vysn)
2022-06-22 18:58:28 +0200gurkenglas(~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de)
2022-06-22 19:00:43 +0200 <tomsmeding> I've heard that en_DK is ideal because they apparently use the yyyy-mm-dd date format — if I don't misremember
2022-06-22 19:00:55 +0200jakalx(~jakalx@base.jakalx.net) (Error from remote client)
2022-06-22 19:00:59 +0200 <k`> Sweden does too.
2022-06-22 19:01:14 +0200coot(~coot@213.134.190.95) (Quit: coot)
2022-06-22 19:01:29 +0200 <tomsmeding> Ah
2022-06-22 19:02:13 +0200 <tomsmeding> Can you guys please convert the rest of the world
2022-06-22 19:02:57 +0200 <k`> Sorry, I'm just using the Sweding locale to get yyyy-mm-dd!
2022-06-22 19:03:05 +0200 <k`> *Swedish
2022-06-22 19:04:02 +0200jakalx(~jakalx@base.jakalx.net)
2022-06-22 19:04:15 +0200 <yushyin> tomsmeding: i wanted something that uses the metric system, sane date format i.e. dd/mm/yyyy and '.' for decimal separator. en_IE was the first thing I came across that fulfilled these conditions
2022-06-22 19:04:18 +0200notzmv(~zmv@user/notzmv) (Ping timeout: 240 seconds)
2022-06-22 19:07:37 +0200brettgilio(~brettgili@virtlab.gq) (Ping timeout: 248 seconds)
2022-06-22 19:07:50 +0200lisbeths(uid135845@id-135845.lymington.irccloud.com) (Quit: Connection closed for inactivity)
2022-06-22 19:11:01 +0200 <sm> ha
2022-06-22 19:11:49 +0200 <sm> I like en_IE too, but I no longer think dd/mm/yyyy is the greatest format
2022-06-22 19:11:56 +0200dlbh^(~dlbh@50.237.44.186) (Ping timeout: 268 seconds)
2022-06-22 19:12:45 +0200 <EvanR> dy/ym/dyym to keep things spicy
2022-06-22 19:12:53 +0200andreas303(andreas303@ip227.orange.bnc4free.com)
2022-06-22 19:13:44 +0200smactually tried to parse that... day of year... year month... day of <explodes>
2022-06-22 19:13:48 +0200 <k`> EvanR: Can I get you on board with 3-space indents and comments in Interlingua as a standard?
2022-06-22 19:14:43 +0200 <yushyin> yyyy-mm-dd is indeed more fancy, but currency with en_DK is DKK and with en_IE it is EUR
2022-06-22 19:14:51 +0200yauhsien_(~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Remote host closed the connection)
2022-06-22 19:15:18 +0200 <k`> You can set LC_MONETARY to something different than LC_TIME.
2022-06-22 19:15:44 +0200 <EvanR> forgot about interlingua
2022-06-22 19:16:44 +0200unit73e(~emanuel@2001:818:e8dd:7c00:32b5:c2ff:fe6b:5291)
2022-06-22 19:17:16 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se)
2022-06-22 19:17:48 +0200 <yushyin> k`: i know! but it was nice to find a locale that more or less is good enough without much mixing different locales
2022-06-22 19:19:01 +0200 <EvanR> en_STATELESS_AND_LOVIN_IT
2022-06-22 19:19:24 +0200 <yushyin> :D
2022-06-22 19:20:37 +0200even4void(even4void@came.here.for-some.fun)
2022-06-22 19:22:46 +0200 <k`> Glad you can put a positive spin on it...
2022-06-22 19:24:58 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se) (Quit: Leaving)
2022-06-22 19:25:31 +0200azimut(~azimut@gateway/tor-sasl/azimut) (Ping timeout: 268 seconds)
2022-06-22 19:25:47 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se)
2022-06-22 19:26:14 +0200xacktm(xacktm@user/xacktm)
2022-06-22 19:26:45 +0200stiell(~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds)
2022-06-22 19:27:25 +0200 <monochrom> day of <explode> = dies irae >:)
2022-06-22 19:27:37 +0200tzh(~tzh@c-24-21-73-154.hsd1.or.comcast.net)
2022-06-22 19:28:21 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se) (Client Quit)
2022-06-22 19:28:58 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se)
2022-06-22 19:30:01 +0200jinsun(~jinsun@user/jinsun) (Ping timeout: 248 seconds)
2022-06-22 19:30:09 +0200 <geekosaur> EvanR: d₂y₃/y₂m₂/d₁y₄y₁m₁
2022-06-22 19:30:28 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se) (Client Quit)
2022-06-22 19:31:07 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se)
2022-06-22 19:31:27 +0200stiell(~stiell@gateway/tor-sasl/stiell)
2022-06-22 19:34:49 +0200vhs(~vhs@c188-151-104-121.bredband.tele2.se) (Client Quit)
2022-06-22 19:37:53 +0200jinsun(~jinsun@user/jinsun)
2022-06-22 19:38:25 +0200Everything(~Everythin@37.115.210.35) (Quit: leaving)
2022-06-22 19:38:25 +0200 <EvanR> nice, spontaneous symmetry breaking
2022-06-22 19:41:02 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection)
2022-06-22 19:41:16 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 19:41:19 +0200dlbh^(~dlbh@50.237.44.186)
2022-06-22 19:41:57 +0200mjs22(~mjs22@76.115.19.239)
2022-06-22 19:42:51 +0200yauhsien(~yauhsien@61-231-23-53.dynamic-ip.hinet.net)
2022-06-22 19:48:55 +0200waleee(~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340)
2022-06-22 19:52:40 +0200Unicorn_Princess(~Unicorn_P@93-103-228-248.dynamic.t-2.net)
2022-06-22 19:55:03 +0200Infinite(~Infinite@2405:204:5381:d6e2:c80:a1c9:d209:de50)
2022-06-22 19:57:37 +0200raym(~raym@user/raym) (Remote host closed the connection)
2022-06-22 19:59:58 +0200raehik(~raehik@cpc95906-rdng25-2-0-cust156.15-3.cable.virginm.net) (Ping timeout: 240 seconds)
2022-06-22 20:02:29 +0200_ht(~quassel@231-169-21-31.ftth.glasoperator.nl)
2022-06-22 20:04:26 +0200raym(~raym@user/raym)
2022-06-22 20:05:41 +0200 <shapr> I published my first thing to hackage yay! https://hackage.haskell.org/package/takedouble
2022-06-22 20:11:46 +0200 <tomsmeding> shapr: nice and compact :)
2022-06-22 20:11:54 +0200 <cjay> nice, congrats :)
2022-06-22 20:11:58 +0200shaprdances cheerfully
2022-06-22 20:21:13 +0200vysn(~vysn@user/vysn) (Ping timeout: 248 seconds)
2022-06-22 20:21:54 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection)
2022-06-22 20:32:59 +0200__monty__(~toonn@user/toonn)
2022-06-22 20:33:22 +0200misterfish(~misterfis@ip214-130-173-82.adsl2.static.versatel.nl)
2022-06-22 20:37:03 +0200 <shapr> When I ran "cabal check" I got a warning that users may not need "-O2" for my package, how do I test whether it makes a difference?
2022-06-22 20:37:18 +0200Infinite9(~Infinite@2405:204:5381:d6e2:c147:f74f:65d9:3fcf)
2022-06-22 20:37:22 +0200Infinite(~Infinite@2405:204:5381:d6e2:c80:a1c9:d209:de50) (Ping timeout: 252 seconds)
2022-06-22 20:37:44 +0200 <shapr> Is there perhaps a known criterion workflow that can tell me?
2022-06-22 20:38:31 +0200 <Infinite9> I'm trying to understand this line: dirs@Dirs{..} <- getAllDirs.
2022-06-22 20:38:32 +0200 <Infinite9> The <- gets me Dir from IO Dir. Then Dirs{..} destructures so that we don't need to specify all the elements of the record. But I don't understand the @ here. I tried looking it up and visible type applications came up. If that's so, is the '@' conforming the type of dirs to Dir? https://pastebin.com/Ck4tWmBs
2022-06-22 20:38:59 +0200 <shapr> Infinite9: the entire result is assigned to the name 'dirs'
2022-06-22 20:39:29 +0200 <monochrom> Look for "as patterns" instead. This is just Haskell 2010 (and 98, and ...)
2022-06-22 20:40:07 +0200 <monochrom> This is also why Google is still not sentient.
2022-06-22 20:41:32 +0200 <Infinite9> monochrom thanks this helped
2022-06-22 20:41:32 +0200 <Infinite9> Actually, I just randomly entered 'a@b' in ghci and it said "Did you mean to enable TypeApplications?" so I tried looking that up.
2022-06-22 20:41:57 +0200 <EvanR> oof
2022-06-22 20:42:18 +0200 <k`> Think I'm in a very small minority here, but for that and a few other reasons I am not a fan of type applications.
2022-06-22 20:42:35 +0200 <k`> Would much rather write (a :: b).
2022-06-22 20:42:52 +0200 <EvanR> @ is doing double duty here
2022-06-22 20:43:03 +0200 <geekosaur> not fond of them either. some people seem to love them, others consider them a mistake
2022-06-22 20:43:37 +0200 <geekosaur> as patterns didn't get mentioned in ghci because you were in an expression as far as ghci was concerned, whereas as-patterns are part of pattern syntax
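A minimal sketch of the pattern in question. The `Dirs` record and `getAllDirs` below are hypothetical stand-ins (the real definitions are in the pastebin); the point is only that `dirs@Dirs{..}` is an as-pattern combined with `RecordWildCards`, with no type application involved.

```haskell
{-# LANGUAGE RecordWildCards #-}
module Main where

-- Hypothetical stand-in for the record in the pastebin.
data Dirs = Dirs
  { configDir :: FilePath
  , cacheDir  :: FilePath
  } deriving (Eq, Show)

getAllDirs :: IO Dirs
getAllDirs = pure (Dirs { configDir = "/tmp/config", cacheDir = "/tmp/cache" })

-- The same pattern works in a function argument:
-- dirs is the whole record, configDir is one of its fields.
describe :: Dirs -> (FilePath, Dirs)
describe dirs@Dirs{..} = (configDir, dirs)

main :: IO ()
main = do
  -- dirs names the whole value; Dirs{..} also binds every field by name.
  dirs@Dirs{..} <- getAllDirs
  putStrLn configDir
  putStrLn cacheDir
  print dirs
```

As geekosaur notes, `a@b` typed at the ghci prompt is read as an expression, where `@` can only mean a type application; inside a pattern, as here, it is the as-pattern from Haskell 98/2010.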
2022-06-22 20:44:04 +0200 <EvanR> someone go back in time and increase the universe of ascii characters slightly
2022-06-22 20:44:44 +0200maerwald(~maerwald@user/maerwald) (Ping timeout: 255 seconds)
2022-06-22 20:46:39 +0200 <geekosaur> shapr, just time it with and without. beware that -O2 can actually slow things down in some cases
2022-06-22 20:46:48 +0200 <monochrom> Oh ghci is not sentient either.
2022-06-22 20:47:11 +0200 <geekosaur> so cabal strongly encourages you to use -O / -O1 instead
2022-06-22 20:47:21 +0200 <monochrom> This is why I am against error messages doing second-guessing.
2022-06-22 20:47:36 +0200 <geekosaur> and the ghc manual tells you -O2 is usually wasted time both in compilation and running
2022-06-22 20:49:05 +0200 <int-e> hmm, is that true though?
2022-06-22 20:49:26 +0200 <int-e> (The latter; the former... ugh, please leave trading compilation time for runtime to the user!)
2022-06-22 20:49:47 +0200 <geekosaur> more specifically what it says is it usually slows compilation significantly while providing little if any benefit and occasionally making things worse, iirc
2022-06-22 20:50:25 +0200 <int-e> I should do my own profiling. Not saying that I will...
2022-06-22 20:51:54 +0200Pickchea(~private@user/pickchea)
2022-06-22 20:54:03 +0200 <monochrom> I just idly wonder if the GHC user's guide is outdated on this.
2022-06-22 20:54:35 +0200 <int-e> Well, maybe a sample: one random and tiny program sees a speedup of 15% from using -O2. And it takes just enough time for the runtime improvement to outweigh the extra compilation time.
2022-06-22 20:55:01 +0200 <int-e> (compile + execute is in the 2.5s ballpark for this sample)
2022-06-22 20:55:02 +0200 <k`> I just idly wonder if `lens` is the package that most benefits from -O2, and yet is the package you least want to spend more time compiling.
2022-06-22 20:55:43 +0200 <monochrom> haha
2022-06-22 20:55:45 +0200 <int-e> k`: have you tried building regex-tdfa or haskell-src-exts?
2022-06-22 20:55:46 +0200 <k`> (Much love to Ed K. for making one of the most beautiful, useful packages on Hackage.)
2022-06-22 20:56:24 +0200 <k`> int-e: No. Are you saying that a regex library takes a long time to compile?
2022-06-22 20:56:36 +0200 <int-e> this particular one does, IME
2022-06-22 20:56:54 +0200 <int-e> I never looked into it though.
2022-06-22 20:57:02 +0200 <int-e> (So I don't know why)
2022-06-22 20:57:25 +0200 <monochrom> My idle wonder cuts both ways. bytestring and vector needed -O2 a decade ago. I also idly wonder whether today they still do.
2022-06-22 20:57:57 +0200maerwald(~maerwald@mail.hasufell.de)
2022-06-22 20:58:27 +0200 <monochrom> Forgive me for not even asking in #ghc, today is a hot day and I feel like chilling out and slacking off :)
2022-06-22 21:00:16 +0200 <monochrom> But I'm happy enough that -O1 already does wonders and is the cabal default.
2022-06-22 21:00:37 +0200 <dolio> I don't really understand why it matters much for those examples.
2022-06-22 21:01:12 +0200 <dolio> If you're working on them, then I understand caring. But people using them recompile them like twice a year.
2022-06-22 21:01:44 +0200 <monochrom> So in the case of bytestring and vector, my recollection is that -O2 turns on the last mile of aggressive fusion that they direly need.
2022-06-22 21:02:51 +0200 <monochrom> So my guess is that takedouble does not need -O2.
2022-06-22 21:03:27 +0200 <monochrom> takedouble is I/O-bound. It probably spends more time waiting for the OS.
2022-06-22 21:03:31 +0200maerwald(~maerwald@mail.hasufell.de) (Changing host)
2022-06-22 21:03:31 +0200maerwald(~maerwald@user/maerwald)
2022-06-22 21:08:21 +0200pleo(~pleo@user/pleo)
2022-06-22 21:13:58 +0200notzmv(~zmv@user/notzmv)
2022-06-22 21:20:57 +0200machinedgod(~machinedg@66.244.246.252) (Ping timeout: 248 seconds)
2022-06-22 21:21:12 +0200odnes(~odnes@5-203-220-108.pat.nym.cosmote.net) (Quit: Leaving)
2022-06-22 21:21:37 +0200machinedgod(~machinedg@66.244.246.252)
2022-06-22 21:22:18 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 21:26:49 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Ping timeout: 248 seconds)
2022-06-22 21:27:25 +0200kannon(~NK@135-180-47-54.fiber.dynamic.sonic.net)
2022-06-22 21:28:14 +0200 <Franciman> sm: thank you very much for the podcast link, i'm enjoying it a lot
2022-06-22 21:30:39 +0200 <shapr> is there some way a running haskell binary can ask cabal for the modules in the library stanza?
2022-06-22 21:30:45 +0200szkl(uid110435@id-110435.uxbridge.irccloud.com)
2022-06-22 21:30:49 +0200 <shapr> I should probably move this to #haskell-in-depth again
2022-06-22 21:31:25 +0200eggplantade(~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net)
2022-06-22 21:33:21 +0200 <monochrom> A running haskell binary may be running on a computer that has no cabal in the first place.
2022-06-22 21:34:02 +0200 <shapr> yeah, true
2022-06-22 21:34:34 +0200kimjetwav(~user@2607:fea8:2340:da00:59a1:33be:cb76:515a)
2022-06-22 21:34:35 +0200dumptruckman(~dumptruck@45-79-173-88.ip.linodeusercontent.com) (Quit: ZNC - https://znc.in)
2022-06-22 21:35:10 +0200 <int-e> For installed libraries, `ghc-pkg describe` has that kind of information.
2022-06-22 21:35:15 +0200 <shapr> oh interesting
2022-06-22 21:36:15 +0200 <geekosaur[m]> But a running program doesn't even know what libraries it's using
2022-06-22 21:36:32 +0200 <int-e> sure, I shifted the goalpost to somewhere reachable
2022-06-22 21:36:34 +0200 <shapr> I could shell out, but it's probably more trouble than it's worth at this stage
2022-06-22 21:36:42 +0200 <kannon> hi, in this program, why the main in the if/else clause? It works the same without it: https://paste.tomsmeding.com/WYmwl13U
2022-06-22 21:37:30 +0200 <kannon> edited: https://paste.tomsmeding.com/o8LJNdMJ
2022-06-22 21:38:56 +0200 <int-e> hmm that's missing a "="
2022-06-22 21:39:15 +0200 <int-e> ...so if by "it works the same" you mean that neither version is working...
2022-06-22 21:39:42 +0200 <kannon> sorry yeah second edit https://paste.tomsmeding.com/XgSOUg7v
2022-06-22 21:39:48 +0200 <int-e> but the idea here is to start over, asking for another line of input when the last input wasn't "quit"
2022-06-22 21:39:51 +0200 <Maxdamantus> 00:50:20 < maerwald> what's the gain
2022-06-22 21:40:13 +0200 <monochrom> I mean why not? This is just plain recursion expressing a plain loop.
2022-06-22 21:40:53 +0200 <Maxdamantus> maerwald: I don't think it requires any of the things you listed (from the user of Haskell), but I guess to summarise the overall gain, it means that in general, it should be harder for code to be incorrect at handling data.
2022-06-22 21:40:55 +0200 <int-e> So I'm not sure in which way the behavior is the same without that line... maybe this indicates lack of testing :P
2022-06-22 21:41:13 +0200 <kannon> int-e I had it written correctly in ghci. they both worked whether main was in the clause or not..
2022-06-22 21:41:30 +0200 <int-e> "worked"
2022-06-22 21:41:44 +0200 <int-e> well, that may be the case, you didn't supply a specification
2022-06-22 21:41:54 +0200 <monochrom> Yeah I call confirmation bias.
2022-06-22 21:41:59 +0200 <Maxdamantus> maerwald: with the `ShortString` mechanism, someone clever could manually encode their UTF-8 or UTF-16 strings to `ShortString` without going through the proper APIs, and then they'll have code that seems to work on one platform but fails on the other platform even for well-formed Unicode.
2022-06-22 21:42:28 +0200 <kannon> specification ? int-e
2022-06-22 21:42:53 +0200 <int-e> "it works" is essentially devoid of meaning
2022-06-22 21:43:09 +0200 <Maxdamantus> maerwald: and I have a feeling there could be security issues due to the mixing of encoding forms (that is, because `ShortString` sometimes represents UTF-8 and sometimes represents UTF-16).
2022-06-22 21:43:16 +0200 <int-e> because it doesn't say what the expected behavior is
2022-06-22 21:43:46 +0200 <int-e> kannon: http://paste.debian.net/1244895/ <-- this won't work the same way if you drop the call to `main` inside `main`.
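A minimal sketch of the pattern int-e and monochrom describe: an action that calls itself to ask for another line unless the input was "quit". The pasted programs are not reproduced in the log, so the names `shouldContinue` and `repl` and the echoed text are assumptions made for illustration; in kannon's program the recursive action is `main` itself.

```haskell
import System.IO (hFlush, stdout)

-- Hypothetical helper: does this line of input keep the session going?
shouldContinue :: String -> Bool
shouldContinue line = line /= "quit"

-- Read one line; unless it was "quit", echo it and start over.
repl :: IO ()
repl = do
  putStr "> "
  hFlush stdout
  line <- getLine
  if shouldContinue line
    then do
      putStrLn ("You said: " ++ line)
      repl  -- the recursive call is the loop
    else putStrLn "Bye."
```

Dropping the recursive call still type-checks, but the action then handles exactly one line and returns, which is easy to miss when testing with a single input.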
2022-06-22 21:45:18 +0200 <kannon> one moment thanks int-e
2022-06-22 21:45:31 +0200 <Maxdamantus> maerwald: I think UTF-8 was pretty much designed around this principle. Unless you're using `wchar_t`, it's actually kind of hard to write C code that doesn't handle UTF-8 properly.
2022-06-22 21:45:46 +0200 <maerwald> Maxdamantus: uhm... the UTF-8 roundtripping has security issues
2022-06-22 21:45:57 +0200 <maerwald> see https://unicode.org/L2/L2009/09236-pep383-problems.html
2022-06-22 21:46:04 +0200 <maerwald> and http://blog.omega-prime.co.uk/2011/03/29/security-implications-of-pep-383/
2022-06-22 21:46:05 +0200juri_(~juri@79.140.115.124) (Read error: Connection reset by peer)
2022-06-22 21:46:18 +0200 <maerwald> if you don't touch the filepath encodings, there are none of those issues
2022-06-22 21:46:20 +0200dumptruckman(~dumptruck@23-239-13-163.ip.linodeusercontent.com)
2022-06-22 21:47:12 +0200 <maerwald> also: broken serialisation, broken equality checks, etc.
2022-06-22 21:47:15 +0200juri_(~juri@79.140.115.124)
2022-06-22 21:47:21 +0200juri_(~juri@79.140.115.124) (Read error: Connection reset by peer)
2022-06-22 21:47:54 +0200 <Maxdamantus> maerwald: presumably you would be talking about security issues in my solution (using WTF-8) specifically on Windows?
2022-06-22 21:48:06 +0200 <maerwald> no, I'm talking about PEP 383
2022-06-22 21:48:08 +0200 <Maxdamantus> on Linux the conversion is a no-op.
2022-06-22 21:48:21 +0200 <maerwald> which current Haskell code is using
2022-06-22 21:48:22 +0200mc47(~mc47@xmonad/TheMC47)
2022-06-22 21:48:29 +0200dlbh^(~dlbh@50.237.44.186) (Ping timeout: 256 seconds)
2022-06-22 21:48:44 +0200 <maerwald> however, *without* enforcing UTF-8
2022-06-22 21:49:20 +0200 <maerwald> and PEP 383 doesn't work for every encoding. It's only "total" under fully roundtrippable encodings and those that are ASCII supersets
2022-06-22 21:50:05 +0200 <maerwald> the alternative would be forcing UTF-8 for all haskell code... then all your non-UTF8 filepaths have odd representations in Haskell
2022-06-22 21:50:21 +0200 <maerwald> but they would at least be roundtrippable
2022-06-22 21:50:34 +0200 <kannon> int-e: thanks I see the difference. cheers
2022-06-22 21:50:35 +0200 <maerwald> but now you lost the original encoding, lol
2022-06-22 21:50:43 +0200 <Maxdamantus> bytes are always roundtrippable, because there's no conversion.
2022-06-22 21:50:48 +0200 <Maxdamantus> gtg
2022-06-22 21:51:09 +0200 <maerwald> Maxdamantus: no, they are not
2022-06-22 21:51:15 +0200 <Maxdamantus> or do you mean roundtrippable from [Char]?
2022-06-22 21:51:34 +0200 <maerwald> https://peps.python.org/pep-0383/
2022-06-22 21:51:37 +0200 <Maxdamantus> I think conversion from [Char] to filenames should just emit replacement characters on error.
2022-06-22 21:51:49 +0200 <Maxdamantus> (eg, when using the PEP-383 encoding)
2022-06-22 21:51:54 +0200 <Maxdamantus> I'm familiar with PEP-383.
2022-06-22 21:52:16 +0200juri_(~juri@79.140.115.124)
2022-06-22 21:53:02 +0200 <maerwald> the problem now is also that conversion functions running on your filepath have to understand the meaning of those PEP-383 high surrogate pairs
2022-06-22 21:53:08 +0200 <maerwald> or they might create security bugs
2022-06-22 21:54:25 +0200 <maerwald> PEP-383 is only safe, if the user does nothing with the filepaths, but just passes them around
2022-06-22 21:56:32 +0200 <maerwald> I dunno... why not just stop messing with them :p
2022-06-22 21:56:39 +0200kannon(~NK@135-180-47-54.fiber.dynamic.sonic.net) (Quit: leaving)
2022-06-22 21:57:04 +0200 <EvanR> formal abstract filepath algebra
2022-06-22 21:57:27 +0200 <EvanR> filepath semigroupoids
2022-06-22 21:57:52 +0200 <EvanR> don't worry about what they are, only worry about where they go
2022-06-22 21:58:09 +0200 <Maxdamantus> maerwald: this only applies to conversion from [Char], which should emit replacement characters for surrogate Char values.
2022-06-22 21:58:38 +0200 <maerwald> Maxdamantus: huh?
2022-06-22 21:58:51 +0200juri_(~juri@79.140.115.124) (Read error: Connection reset by peer)
2022-06-22 21:58:57 +0200 <maerwald> if you emit replacement char, you break the semantics
2022-06-22 21:59:10 +0200juri_(~juri@79.140.115.124)
2022-06-22 21:59:11 +0200 <maerwald> you might even delete a wrong file :p
2022-06-22 21:59:40 +0200 <Maxdamantus> you can't in general round trip with [Char] to filenames.
2022-06-22 22:00:00 +0200 <Maxdamantus> If you could, we could just continue using that for representing filenames.
2022-06-22 22:00:06 +0200 <maerwald> roundtripping is well defined for UTF-8 with PEP 383
2022-06-22 22:00:14 +0200 <maerwald> you can roundtrip any bytestring through that afaik
2022-06-22 22:00:31 +0200 <EvanR> (but how do you utf-8 encode a surrogate Char)
2022-06-22 22:00:54 +0200 <EvanR> or is that an obvious
2022-06-22 22:01:47 +0200Infinite9(~Infinite@2405:204:5381:d6e2:c147:f74f:65d9:3fcf) (Quit: Client closed)
2022-06-22 22:02:51 +0200 <Maxdamantus> Well, you could do it that way, but that change would probably break current Haskell code.
2022-06-22 22:03:42 +0200juri_(~juri@79.140.115.124) (Ping timeout: 264 seconds)
2022-06-22 22:03:45 +0200dlbh^(~dlbh@50.237.44.186)
2022-06-22 22:04:30 +0200juri_(~juri@84-19-175-179.pool.ovpn.com)
2022-06-22 22:04:31 +0200 <Maxdamantus> I think your method results in security issues because of the different representations of well-formed Unicode.
2022-06-22 22:04:50 +0200 <maerwald> Maxdamantus: there's no unicode in abstract filepath.
2022-06-22 22:05:31 +0200eggplantade(~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net) (Remote host closed the connection)
2022-06-22 22:05:33 +0200superbil(~superbil@1-34-176-171.hinet-ip.hinet.net) (Ping timeout: 265 seconds)
2022-06-22 22:06:03 +0200 <Maxdamantus> eg, "à" is sometimes represented as [0x00c3, 0x00a0] and sometimes represented as [0x00e1]
2022-06-22 22:06:19 +0200 <maerwald> Maxdamantus: those are different platforms
2022-06-22 22:07:00 +0200 <Maxdamantus> if someone accidentally produces the wrong coding for the platform, you've got a string that is equal to the interpretation of ill-formed unicode.
2022-06-22 22:07:14 +0200 <Maxdamantus> that's where security issues arise.
2022-06-22 22:07:17 +0200 <maerwald> Maxdamantus: I don't understand what that means
2022-06-22 22:08:10 +0200nate4(~nate@98.45.169.16)
2022-06-22 22:08:28 +0200 <maerwald> 1. there's no such thing as "wrong encoding" for abstract filepath, 2. they are distinct across platforms (they don't even have the same constructor)... so it's not even possible to accidentally compare a windows filepath with a unix filepath. That doesn't compile.
2022-06-22 22:09:11 +0200 <maerwald> you'd have to explicitly convert them to ByteString or ShortByteString at which point, the library has no business with what you're doing anymore
2022-06-22 22:09:39 +0200 <Maxdamantus> maerwald: if someone hardcodes [0x00e1] into their program because it works on Windows, then the program is run on Linux, I can match that string by providing the ill-formed UTF-8, <E1>
2022-06-22 22:10:05 +0200 <maerwald> Maxdamantus: how would the user do that?
2022-06-22 22:10:29 +0200 <maerwald> there is no *safe* function to do that
2022-06-22 22:11:22 +0200 <Maxdamantus> maerwald: by providing a filename on Linux that contains that invalid UTF-8.
2022-06-22 22:12:10 +0200 <Maxdamantus> maerwald: maybe it's a program that scans through a directory and executes a file if it's called "à"
2022-06-22 22:12:13 +0200 <maerwald> this makes no sense to me... you're saying users can use unsafe API to construct wrong filepaths and then claim that's the fault of the library?
2022-06-22 22:12:24 +0200johnw(~johnw@76-234-69-149.lightspeed.frokca.sbcglobal.net)
2022-06-22 22:12:27 +0200 <maerwald> you can already do that today with string based filepaths by switching encoding in between
2022-06-22 22:13:11 +0200 <monochrom> unsafePerformIO comes from the library. It is the fault of the library. :)
2022-06-22 22:13:14 +0200nate4(~nate@98.45.169.16) (Ping timeout: 268 seconds)
2022-06-22 22:13:36 +0200 <Maxdamantus> maerwald: the admin has other ways of preventing people from making "à" files, but it turns out that that's not the actual name being tested.
2022-06-22 22:13:49 +0200 <EvanR> now I'm imagining an idealized program which has a clean separation between pure code and the OS API, and being allowed to use both linux and windows at will, somehow xD
2022-06-22 22:13:58 +0200 <Maxdamantus> anyway, need to stop typing. on phone on a bus and my hands are really cold.
2022-06-22 22:14:29 +0200 <maerwald> Maxdamantus: I think you should check out the API. You'll see that it isn't easy to do what you're suggesting without either using internal modules or using functions that have the *unsafe* prefix
2022-06-22 22:14:33 +0200 <monochrom> I thought the phone would be hot enough to warm your hands. Mine does.
2022-06-22 22:15:04 +0200 <geekosaur> if you're relying on a unix filename being utf-8 you are sinning anyway
2022-06-22 22:15:29 +0200 <monochrom> Yikes, I rely on utf-8 unix filenames all the time...
2022-06-22 22:15:53 +0200 <maerwald> EvanR: I'm not sure about your proposal, I'll have to try a few examples
2022-06-22 22:16:00 +0200 <monochrom> I have some Chinese filenames. Not going back to Big5. :)
2022-06-22 22:16:01 +0200 <geekosaur> if you control those names it may be a safe assumption. until your backup program assumes latin-1…
2022-06-22 22:16:30 +0200 <monochrom> Ah, true. Now I need to check that duplicity doesn't break my backup :)
2022-06-22 22:16:31 +0200 <maerwald> EvanR: you mean basically bytestring that have all sorts of random surrogate chars... and whether pep 383 will choke on it?
2022-06-22 22:16:50 +0200 <EvanR> maerwald, what I'm thinking of is impossible in practice... programs run on 1 OS at a time
2022-06-22 22:17:00 +0200 <EvanR> as far as I know
2022-06-22 22:17:13 +0200 <geekosaur> until windows decides to integrate wsl better
2022-06-22 22:17:26 +0200 <geekosaur> (or goes back to the old posix subsystem stuff)
2022-06-22 22:17:39 +0200 <maerwald> I ran a property test over the UTF-8 roundtrip encoding feeding it random bytestrings... it always roundtripped
2022-06-22 22:18:06 +0200 <EvanR> I think issues with filepath exist way before we have such tech
2022-06-22 22:19:12 +0200superbil(~superbil@1-34-176-171.hinet-ip.hinet.net)
2022-06-22 22:19:23 +0200werneta(~werneta@137.78.30.207)
2022-06-22 22:19:58 +0200z0k(~z0k@206.84.141.12) (Ping timeout: 240 seconds)
2022-06-22 22:20:06 +0200_ht(~quassel@231-169-21-31.ftth.glasoperator.nl) (Remote host closed the connection)
2022-06-22 22:20:12 +0200 <monochrom> EvanR: Continuing your crazy plan, we can re-define RPC to mean "relayed process control" meaning that you run a program on Windows and then you just suspend it and send its memory dump to a Linux host and resume running there. >:)
2022-06-22 22:20:30 +0200 <EvanR> ah that might be a way
2022-06-22 22:20:39 +0200 <geekosaur> isn't that where llvm came from?
2022-06-22 22:20:49 +0200 <monochrom> Oh haha
2022-06-22 22:20:49 +0200 <geekosaur> supercomputers want that tech
2022-06-22 22:20:55 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 22:21:28 +0200 <monochrom> But blame it on ST:TNG for giving me that idea multiple times with its holodeck tricks.
2022-06-22 22:21:52 +0200 <monochrom> holodeck+beaming tricks
2022-06-22 22:22:23 +0200 <EvanR> moriarty I dare you to walk out that door
2022-06-22 22:22:23 +0200shriekingnoise(~shrieking@201.212.175.181) (Quit: Quit)
2022-06-22 22:22:29 +0200 <EvanR> no problem
2022-06-22 22:22:31 +0200 <monochrom> Heh
2022-06-22 22:22:44 +0200shriekingnoise(~shrieking@201.212.175.181)
2022-06-22 22:23:22 +0200 <geekosaur> I still want to know what numbskull didn't completely isolate those systems…
2022-06-22 22:23:23 +0200 <EvanR> cogito ergo sum
2022-06-22 22:23:48 +0200zeenk(~zeenk@2a02:2f04:a301:3d00:39df:1c4b:8a55:48d3)
2022-06-22 22:24:02 +0200 <EvanR> TNG's optimistic future has lax computer security
2022-06-22 22:24:09 +0200 <monochrom> I'm a great tautologist. I'll one-up Descartes with: cogito ergo cogito.
2022-06-22 22:24:17 +0200 <k`> Well, look, sometimes when we get transporter transmissions we get data that can't be encoded as physical matter. So we send it to the holodeck and see what it looks like...
2022-06-22 22:24:21 +0200bitdex(~bitdex@gateway/tor-sasl/bitdex) (Ping timeout: 268 seconds)
2022-06-22 22:24:51 +0200 <monochrom> Clearly Data is encoded as physical matter. >:)
2022-06-22 22:25:55 +0200 <k`> You really don't want to create an anti-Riker just because a few bits got flipped in the transporter. But at the same time, you don't want to drop all his information because it's invalid.
2022-06-22 22:26:17 +0200 <geekosaur> that's what ecc is for
2022-06-22 22:26:25 +0200jmdaemon(~jmdaemon@user/jmdaemon)
2022-06-22 22:26:34 +0200 <EvanR> anti-riker is impossible, how would you add a beard (because we shall not speak of anything featuring him without a beard)
2022-06-22 22:26:36 +0200 <geekosaur> and fec, etc.
2022-06-22 22:26:50 +0200 <k`> I think the Federation uses a patented Grey encoding.
2022-06-22 22:27:33 +0200bitdex(~bitdex@gateway/tor-sasl/bitdex)
2022-06-22 22:27:55 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Remote host closed the connection)
2022-06-22 22:28:06 +0200 <EvanR> wait is the transporter protocols and inevitable dramatic failures actually relevant to Filepath after all
2022-06-22 22:28:21 +0200 <monochrom> Sorry!
2022-06-22 22:28:47 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net)
2022-06-22 22:29:10 +0200 <monochrom> But I guess inevitable dramatic failures in general are relevant to everything including file paths.
2022-06-22 22:29:30 +0200 <k`> Is `git-annex` creating beardless Rikers?
2022-06-22 22:29:43 +0200 <monochrom> The whole point why people are talking about inevitable dramatic failures when using PEP-383 for file paths.
2022-06-22 22:30:25 +0200 <monochrom> I think no matter what you use for file paths, you will have dramatic failures.
2022-06-22 22:30:55 +0200mikoto-chan(~mikoto-ch@esm-84-240-99-143.netplaza.fi)
2022-06-22 22:31:12 +0200 <maerwald> the problem with PEP-383 is: 1. you lose the original encoding 2. it actually produces invalid UTF-8 in the strict sense
2022-06-22 22:31:25 +0200 <maerwald> so if you run a strict UTF-8 converter over it, it fails
2022-06-22 22:31:54 +0200 <maerwald> https://gist.github.com/hasufell/c600d318bdbe010a7841cc351c835f92#failure-6-re-encoding-pep-383-ut…
2022-06-22 22:32:42 +0200 <maerwald> that's not a great property to have lol
2022-06-22 22:32:53 +0200 <k`> maerwald: The whole point of 383 is to not use strict UTF8 and to have a reversible encoder so you never lose the original encoding.
2022-06-22 22:32:55 +0200mc47(~mc47@xmonad/TheMC47) (Remote host closed the connection)
2022-06-22 22:32:58 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Ping timeout: 240 seconds)
2022-06-22 22:33:54 +0200 <maerwald> k`: yes, but it doesn't work well
2022-06-22 22:33:57 +0200 <monochrom> So in Haskell or any language with sum types, we can go like "data Path = Decodable [Char] | Undecodable ByteString", and neither case distorts the data.
2022-06-22 22:34:14 +0200 <monochrom> (Replace [] by any efficient sequence container type you like)
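A rough sketch of monochrom's sum type, for illustration only. The type and constructor names come from the message above; `fromBytes` and `toBytes` are hypothetical helpers, and the decoder is assumed to be strict UTF-8 (via `decodeUtf8'` from the `text` package) rather than whatever the locale says.

```haskell
import Data.ByteString (ByteString)
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', encodeUtf8)

-- Either the whole path decoded cleanly, or we keep the raw bytes untouched.
data Path
  = Decodable String
  | Undecodable ByteString
  deriving (Eq, Show)

-- Hypothetical constructor: try strict UTF-8, fall back to the raw bytes.
fromBytes :: ByteString -> Path
fromBytes bs = case decodeUtf8' bs of
  Right t -> Decodable (T.unpack t)
  Left _  -> Undecodable bs

-- Hypothetical inverse: recover the bytes to hand back to the OS.
toBytes :: Path -> ByteString
toBytes (Decodable s)    = encodeUtf8 (T.pack s)
toBytes (Undecodable bs) = bs
```

Under these assumptions `toBytes (fromBytes bs)` gives back `bs` for any input, because a path is only marked `Decodable` when strict decoding succeeded. The sketch does not address how to compare or append across the two cases.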
2022-06-22 22:34:33 +0200 <monochrom> Things like PEP-383 are invented by people who are afraid of sum types.
2022-06-22 22:34:38 +0200 <maerwald> k`: a call to `setFileSystemEncoding` can make it fail... serializing the String is unsafe, etc.
2022-06-22 22:34:52 +0200 <monochrom> or even the class-subclass encoding of sum types.
2022-06-22 22:34:58 +0200 <EvanR> how about | Unencodable [Char] xD
2022-06-22 22:35:13 +0200lyle(~lyle@104.246.145.85) (Quit: WeeChat 3.5)
2022-06-22 22:35:19 +0200 <geekosaur> Char assumes Unicode codepoints. [Word8]
2022-06-22 22:35:26 +0200 <k`> monochrom: So then when you append a Decodable prefix to an Undecodable filename you end up with an Undecodable path?
2022-06-22 22:35:31 +0200 <geekosaur> which makes it just an inefficient ByteString
2022-06-22 22:35:41 +0200 <EvanR> the utf16 surrogates...
2022-06-22 22:35:46 +0200coot(~coot@213.134.190.95)
2022-06-22 22:35:56 +0200 <monochrom> I want Unicode codepoints, Haskell Char, in the Decodable case.
2022-06-22 22:35:58 +0200 <EvanR> oh, you meant utf32
2022-06-22 22:35:59 +0200 <geekosaur> k`, what else could you end up with?
2022-06-22 22:36:38 +0200 <monochrom> I think we should not allow that appending.
2022-06-22 22:36:57 +0200 <geekosaur> probably the most correct solution
2022-06-22 22:37:21 +0200 <EvanR> how can you deny the power of the /
2022-06-22 22:37:22 +0200 <geekosaur> (granting that someone will want it, but in that case they should provide a Undecodable prefix)
2022-06-22 22:38:17 +0200 <monochrom> Unpopular opinion: The whole point of PEP-383 is avoiding real sum types and rolling your own tagging.
2022-06-22 22:38:25 +0200 <EvanR> if you have two valid paths, how could / not join them xD
2022-06-22 22:38:45 +0200 <monochrom> Right? Use a high surrogate as tag for "I can't decode this byte, here is the byte itself"
2022-06-22 22:38:59 +0200 <monochrom> In Haskell land we call it "Either Char Word8"
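To make the "Either Char Word8" reading concrete, here is a small sketch of per-byte tagging. It assumes the PEP 383 `surrogateescape` convention, where an undecodable byte b in 0x80..0xFF is represented as the lone surrogate code point U+DC00 + b; the names `Unit`, `escape` and `unescape` are made up for this example.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- One element of a decoded path: a real character, or a byte that failed to decode.
type Unit = Either Char Word8

-- Flatten the sum type into a single Char, PEP 383 style.
-- Only meaningful for bytes >= 0x80; lower bytes are plain ASCII and never escaped.
escape :: Unit -> Char
escape (Left c)  = c
escape (Right b) = chr (0xDC00 + fromIntegral b)

-- Recover the tag from a Char: U+DC80..U+DCFF means "this was a raw byte".
unescape :: Char -> Unit
unescape c
  | n >= 0xDC80 && n <= 0xDCFF = Right (fromIntegral (n - 0xDC00))
  | otherwise                  = Left c
  where
    n = ord c
```

The flattened form is ambiguous in one direction: a `Left` character that already lies in U+DC80..U+DCFF comes back from `unescape` as `Right`, which is the kind of tagging collision an explicit sum type avoids.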
2022-06-22 22:43:46 +0200 <k`> So is the string undecodable after the bad byte or just at the bad byte?
2022-06-22 22:44:09 +0200 <monochrom> I guess my idea still doesn't answer the question of comparing a decodable path with an undecodable path, the latter being undecodable just because of misfortunate locale settings.
2022-06-22 22:44:50 +0200 <monochrom> My idea declares the whole path undecodable. PEP-383 declares individual bytes undecodable.
2022-06-22 22:45:26 +0200 <monochrom> I think there is no answer to that question.
2022-06-22 22:45:40 +0200 <maerwald> the answer is: don't decode if you don't have to :p
2022-06-22 22:45:51 +0200 <maerwald> and most of the time, you actually don't
2022-06-22 22:46:07 +0200 <monochrom> Ah, right, I can stand behind that.
2022-06-22 22:46:10 +0200 <maerwald> e.g. you don't need to understand the filename encoding when splitting filepaths
2022-06-22 22:46:19 +0200 <maerwald> because the separator char '/' is well defined
2022-06-22 22:46:24 +0200 <maerwald> and not encoding specific
2022-06-22 22:46:30 +0200 <maerwald> you just scan and split, ignoring the rest
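A sketch of "scan and split" on raw bytes, assuming a POSIX-style path where the separator is the single byte 0x2F. `splitComponents` is a hypothetical name, not the API of any filepath library; it only uses `split` from `Data.ByteString`.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- Split a raw path on the '/' byte (0x2F) and drop empty components,
-- without ever decoding the bytes in between.
splitComponents :: ByteString -> [ByteString]
splitComponents = filter (not . B.null) . B.split 0x2F
```

Components that are not valid in any particular encoding pass through unchanged, so nothing has to be round-tripped through `[Char]`.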
2022-06-22 22:47:42 +0200nate4(~nate@98.45.169.16)
2022-06-22 22:47:42 +0200takuan(~takuan@178-116-218-225.access.telenet.be) (Remote host closed the connection)
2022-06-22 22:47:48 +0200 <k`> You also need to check for an escaped '/', but I see what you're saying.
2022-06-22 22:48:13 +0200 <monochrom> I think I haven't seen a file system that provides for an escaped /
2022-06-22 22:48:14 +0200 <maerwald> k`: so would you rather see Haskell enforcing UTF-8 so that PEP 383 actually works *all the time*?
2022-06-22 22:48:42 +0200 <k`> maerwald: Yes.
2022-06-22 22:48:43 +0200 <maerwald> that would mean to ignore locale
2022-06-22 22:48:48 +0200 <monochrom> or windows providing for an escaped \
2022-06-22 22:50:44 +0200 <EvanR> on mac typing / into the filename causes a fancy phantom / character from the astral plane to be used
2022-06-22 22:50:52 +0200 <geekosaur> iirc namei() or equivalent is not per filesystem so there is no way to escape / regardless of filesystem
2022-06-22 22:51:06 +0200 <geekosaur> back in the day that was converted to :
2022-06-22 22:51:19 +0200 <geekosaur> and similarly : to / (whee ancient macos)
2022-06-22 22:51:21 +0200 <EvanR> ascii / means /
2022-06-22 22:51:36 +0200 <maerwald> k`: why not simply avoid roundtripping?
2022-06-22 22:53:05 +0200Pickchea(~private@user/pickchea) (Ping timeout: 256 seconds)
2022-06-22 22:54:04 +0200jgeerds(~jgeerds@55d45f48.access.ecotel.net)
2022-06-22 22:56:02 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection)
2022-06-22 22:59:50 +0200Tuplanolla(~Tuplanoll@91-159-69-97.elisa-laajakaista.fi)
2022-06-22 23:01:44 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e)
2022-06-22 23:04:11 +0200bitdex(~bitdex@gateway/tor-sasl/bitdex) (Remote host closed the connection)
2022-06-22 23:05:27 +0200coot(~coot@213.134.190.95) (Quit: coot)
2022-06-22 23:06:21 +0200bitdex(~bitdex@gateway/tor-sasl/bitdex)
2022-06-22 23:13:39 +0200pleo(~pleo@user/pleo) (Read error: Connection reset by peer)
2022-06-22 23:14:01 +0200pleo(~pleo@user/pleo)
2022-06-22 23:14:36 +0200MironZ3(~MironZ@nat-infra.ehlab.uk)
2022-06-22 23:14:55 +0200yrlnry(~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net)
2022-06-22 23:14:57 +0200shriekingnoise(~shrieking@201.212.175.181) (Quit: Quit)
2022-06-22 23:15:16 +0200shriekingnoise(~shrieking@201.212.175.181)
2022-06-22 23:15:59 +0200MironZ(~MironZ@nat-infra.ehlab.uk) (Quit: Ping timeout (120 seconds))
2022-06-22 23:15:59 +0200MironZ3MironZ
2022-06-22 23:20:39 +0200rendar(~Paxman@user/rendar) (Quit: Leaving)
2022-06-22 23:25:23 +0200mikoto-chan(~mikoto-ch@esm-84-240-99-143.netplaza.fi) (Ping timeout: 256 seconds)
2022-06-22 23:25:59 +0200mikoto-chan(~mikoto-ch@esm-84-240-99-143.netplaza.fi)
2022-06-22 23:28:28 +0200misterfish(~misterfis@ip214-130-173-82.adsl2.static.versatel.nl) (Ping timeout: 268 seconds)
2022-06-22 23:28:59 +0200liz(~liz@host86-159-158-175.range86-159.btcentralplus.com)
2022-06-22 23:35:01 +0200mikoto-chan(~mikoto-ch@esm-84-240-99-143.netplaza.fi) (Ping timeout: 244 seconds)
2022-06-22 23:35:39 +0200bilegeek(~bilegeek@2600:1008:b06f:8528:b8b4:9bf9:3a8:ef97)
2022-06-22 23:45:18 +0200__monty__(~toonn@user/toonn) (Quit: leaving)
2022-06-22 23:46:23 +0200 <tomsmeding> I believe it's still converted to : nowadays if you enter a / in Finder, or at least that worked a few years ago still
2022-06-22 23:46:34 +0200 <tomsmeding> also /
2022-06-22 23:47:43 +0200 <EvanR> yeah that thing
2022-06-22 23:47:58 +0200 <EvanR> proprietary solidus
2022-06-22 23:49:11 +0200nate4(~nate@98.45.169.16) (Ping timeout: 256 seconds)
2022-06-22 23:51:14 +0200Qudit(~user@user/Qudit) (Remote host closed the connection)
2022-06-22 23:52:46 +0200 <tomsmeding> スラッシュ
2022-06-22 23:53:06 +0200eggplantade(~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection)
2022-06-22 23:53:26 +0200justsomeguy(~justsomeg@user/justsomeguy)
2022-06-22 23:54:11 +0200gmg(~user@user/gehmehgeh) (Quit: Leaving)
2022-06-22 23:57:31 +0200michalz(~michalz@185.246.204.107) (Remote host closed the connection)