2022-06-22 00:02:07 +0200 | __monty__ | (~toonn@user/toonn) (Quit: leaving) |
2022-06-22 00:05:06 +0200 | winny | (~weechat@user/winny) |
2022-06-22 00:06:06 +0200 | bontaq | (~user@ool-45779fe5.dyn.optonline.net) (Ping timeout: 276 seconds) |
2022-06-22 00:06:23 +0200 | michalz | (~michalz@185.246.204.107) (Remote host closed the connection) |
2022-06-22 00:11:07 +0200 | takuan | (~takuan@178-116-218-225.access.telenet.be) (Remote host closed the connection) |
2022-06-22 00:14:09 +0200 | n1essa | (~nessa@75-164-218-34.ptld.qwest.net) (Quit: leaving) |
2022-06-22 00:15:04 +0200 | k8yun | (~k8yun@user/k8yun) (Read error: Connection reset by peer) |
2022-06-22 00:15:42 +0200 | acidjnk_new | (~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) (Ping timeout: 264 seconds) |
2022-06-22 00:17:05 +0200 | odnes | (~odnes@5-203-249-68.pat.nym.cosmote.net) (Remote host closed the connection) |
2022-06-22 00:19:14 +0200 | money | (~Gambino@user/polo) |
2022-06-22 00:20:27 +0200 | rito_ | (~rito_gh@45.112.243.199) (Ping timeout: 256 seconds) |
2022-06-22 00:25:37 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 00:29:50 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 240 seconds) |
2022-06-22 00:30:33 +0200 | mikoto-chan | (~mikoto-ch@esm-84-240-99-143.netplaza.fi) |
2022-06-22 00:31:20 +0200 | moonsheep | (~user@user/moonsheep) |
2022-06-22 00:31:30 +0200 | <moonsheep> | Hi there again! |
2022-06-22 00:31:43 +0200 | <moonsheep> | I'm trying to install accelerate, and I've installed llvm 9 from source. |
2022-06-22 00:32:05 +0200 | <moonsheep> | Now when I try to build my project, accelerate-llvm reports the following: `<command line>: libLLVMXRay.so.9: cannot open shared object file: No such file or directory` |
2022-06-22 00:32:24 +0200 | <moonsheep> | Yet if I go look under /usr/local/lib it is clearly there |
2022-06-22 00:32:52 +0200 | <moonsheep> | `llvm-config --libdir` does indeed return `/usr/local/lib` |
2022-06-22 00:34:17 +0200 | money | (~Gambino@user/polo) (Read error: Connection reset by peer) |
2022-06-22 00:37:30 +0200 | Guest3106 | (~Gambino@user/polo) |
2022-06-22 00:38:10 +0200 | jgeerds | (~jgeerds@55d45f48.access.ecotel.net) (Ping timeout: 240 seconds) |
2022-06-22 00:38:13 +0200 | <geekosaur> | llvm-config won't help here, either /usr/local/lib needs to be listed in /etc/ld.so.conf (or under /etc/ld.so.conf.d, on ubuntu) or you need to arrange for accelerate-llvm to use -R |
2022-06-22 00:38:40 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) (Remote host closed the connection) |
2022-06-22 00:38:51 +0200 | Guest3106 | (~Gambino@user/polo) (Remote host closed the connection) |
2022-06-22 00:39:39 +0200 | <monochrom> | And you probably still need to run "sudo ldconfig" because it is /etc/ld.so.cache that is consulted at run time, rather than a real search. |
2022-06-22 00:40:20 +0200 | Polo__ | (~Gambino@user/polo) |
2022-06-22 00:40:35 +0200 | Polo__ | (~Gambino@user/polo) (Read error: Connection reset by peer) |
2022-06-22 00:42:30 +0200 | <geekosaur> | yes |
2022-06-22 00:42:36 +0200 | <geekosaur> | (sorry, making dinner) |
2022-06-22 00:42:43 +0200 | money | (~Gambino@user/polo) |
2022-06-22 00:43:08 +0200 | <monochrom> | Ah, I'm a spoiled kid, I just order through Ubereats and continue to IRC :) |
2022-06-22 00:45:08 +0200 | Guest9447 | (~Gambino@user/polo) |
2022-06-22 00:45:31 +0200 | Guest9447 | (~Gambino@user/polo) (Client Quit) |
2022-06-22 00:46:25 +0200 | <EvanR> | dammit don't tempt me |
2022-06-22 00:46:32 +0200 | jmdaemon | (~jmdaemon@user/jmdaemon) (Quit: ZNC 1.8.2 - https://znc.in) |
2022-06-22 00:47:29 +0200 | money | (~Gambino@user/polo) (Ping timeout: 246 seconds) |
2022-06-22 00:47:39 +0200 | jmdaemon | (~jmdaemon@user/jmdaemon) |
2022-06-22 00:49:17 +0200 | <moonsheep> | geekosaur: I did, but that doesn't seem to have any effect |
2022-06-22 00:49:31 +0200 | <moonsheep> | My /etc/ld.so.conf just loads all the files under /etc/ld.so.conf.d |
2022-06-22 00:49:42 +0200 | <moonsheep> | I added one that has /usr/local/lib |
2022-06-22 00:49:57 +0200 | alp__ | (~alp@user/alp) (Ping timeout: 268 seconds) |
2022-06-22 00:50:12 +0200 | <geekosaur> | is it named <whatever>.conf? and did you run `sudo ldconfig` afterward like monochrom said? |
2022-06-22 00:50:14 +0200 | <moonsheep> | And still accelerate fails to build |
2022-06-22 00:50:27 +0200 | <moonsheep> | It is named llvm9.conf and yes I did |
2022-06-22 00:50:38 +0200 | <moonsheep> | I even tried manually removing the cache file but it didn't seem to help |
2022-06-22 00:50:55 +0200 | <geekosaur> | uh, that sounds like a good way to break your system |
2022-06-22 00:50:55 +0200 | <moonsheep> | Oh wait my bad, I'm blind |
2022-06-22 00:51:01 +0200 | <moonsheep> | It's a different error |
2022-06-22 00:51:19 +0200 | <moonsheep> | [a very long path]-ghc8.10.7.so: undefined symbol: _ZTIN4llvm13ErrorInfoBaseE |
2022-06-22 00:51:30 +0200 | <moonsheep> | I am supposed to use llvm 9.0.1 right? |
2022-06-22 00:52:08 +0200 | <moonsheep> | So I guess now it can find the library but it fails to link with it |
2022-06-22 00:52:56 +0200 | <geekosaur> | sounds like it, but the question is what is failing to link with it |
2022-06-22 00:53:12 +0200 | <geekosaur> | "a very long path" minus the long hash at the end |
2022-06-22 00:53:55 +0200 | <moonsheep> | /home/moonsheep/.stack/snapshots/x86_64-linux-tinfo6/<hash>/8.10.7/lib/x86_64-linux-ghc-8.10.7/libHSllvm-hs-9.0.1-S639BV4lBwDq2AVMyPWFd-ghc8.10.7.so: undefined symbol: _ZTIN4llvm13ErrorInfoBaseE |
2022-06-22 00:54:38 +0200 | gurkenglas | (~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de) (Ping timeout: 244 seconds) |
2022-06-22 00:55:10 +0200 | <geekosaur> | odd. I think you need someone familiar with accelerate-llvm at this point |
2022-06-22 00:55:10 +0200 | zeenk | (~zeenk@2a02:2f04:a301:3d00:39df:1c4b:8a55:48d3) (Quit: Konversation terminated!) |
2022-06-22 00:55:38 +0200 | <moonsheep> | Hmm, maybe I should try purging everything |
2022-06-22 00:55:52 +0200 | jmdaemon | (~jmdaemon@user/jmdaemon) (Quit: ZNC 1.8.2 - https://znc.in) |
2022-06-22 00:55:53 +0200 | cheater | (~Username@user/cheater) (Ping timeout: 248 seconds) |
2022-06-22 00:56:25 +0200 | cheater | (~Username@user/cheater) |
2022-06-22 00:58:00 +0200 | moonsheep | (~user@user/moonsheep) (Remote host closed the connection) |
2022-06-22 00:58:12 +0200 | <geekosaur> | https://discourse.llvm.org/t/lost-ztin4llvm13errorinfobasee-symbol/3077 |
2022-06-22 00:58:46 +0200 | <geekosaur> | sounds like you need to rebuild llvm with -DENABLE_LLVM_RTTI=ON |
2022-06-22 00:59:00 +0200 | alp__ | (~alp@user/alp) |
2022-06-22 01:00:00 +0200 | <geekosaur> | I thought that name looked mangled but I wasn't expecting c++ mangling searched for by haskell |
2022-06-22 01:03:59 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 01:04:41 +0200 | <geekosaur> | oh, they left |
2022-06-22 01:05:12 +0200 | <geekosaur> | @tell moonsheep per https://discourse.llvm.org/t/lost-ztin4llvm13errorinfobasee-symbol/3077 you need to rebuild llvm with -DENABLE_LLVM_RTTI=ON |
2022-06-22 01:05:13 +0200 | <lambdabot> | Consider it noted. |
2022-06-22 01:06:40 +0200 | chomwitt | (~chomwitt@2a02:587:dc0d:e600:d03e:b48f:9497:fc81) (Remote host closed the connection) |
2022-06-22 01:08:22 +0200 | moonsheep | (~user@user/moonsheep) |
2022-06-22 01:08:41 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 248 seconds) |
2022-06-22 01:08:52 +0200 | <moonsheep> | geekosaur: ah thanks I'll try that now |
2022-06-22 01:08:58 +0200 | <moonsheep> | Yeah sorry for leaving I tried rebooting |
2022-06-22 01:09:37 +0200 | <geekosaur[m]> | No worries |
2022-06-22 01:09:42 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds) |
2022-06-22 01:11:46 +0200 | <moonsheep> | Hmm, cmake tells me that CMake Warning: |
2022-06-22 01:11:46 +0200 | <moonsheep> | Manually-specified variables were not used by the project: |
2022-06-22 01:11:46 +0200 | <moonsheep> | ENABLE_LLVM_RTTI |
2022-06-22 01:11:54 +0200 | <moonsheep> | Oops didn't mean to paste like that |
2022-06-22 01:13:01 +0200 | <moonsheep> | Ah it's actually called `LLVM_ENABLE_RTTI` |
2022-06-22 01:21:33 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) |
2022-06-22 01:23:10 +0200 | Lord_of_Life | (~Lord@user/lord-of-life/x-2819915) (Ping timeout: 240 seconds) |
2022-06-22 01:23:22 +0200 | Lord_of_Life_ | (~Lord@user/lord-of-life/x-2819915) |
2022-06-22 01:23:27 +0200 | quarkyalice | (~quarkyali@user/quarkyalice) (Quit: quarkyalice) |
2022-06-22 01:24:36 +0200 | Lord_of_Life_ | Lord_of_Life |
2022-06-22 01:26:46 +0200 | quarkyalice | (~quarkyali@user/quarkyalice) |
2022-06-22 01:27:00 +0200 | AlexNoo_ | (~AlexNoo@178.34.160.206) |
2022-06-22 01:29:01 +0200 | AlexZenon | (~alzenon@94.233.240.20) (Ping timeout: 256 seconds) |
2022-06-22 01:29:34 +0200 | Tuplanolla | (~Tuplanoll@91-159-69-97.elisa-laajakaista.fi) (Quit: Leaving.) |
2022-06-22 01:30:43 +0200 | Alex_test | (~al_test@94.233.240.20) (Ping timeout: 256 seconds) |
2022-06-22 01:30:43 +0200 | AlexNoo | (~AlexNoo@94.233.240.20) (Ping timeout: 256 seconds) |
2022-06-22 01:31:21 +0200 | BusConscious | (~martin@ip5f5bdedc.dynamic.kabel-deutschland.de) (Remote host closed the connection) |
2022-06-22 01:32:19 +0200 | stackdroid18 | (14094@user/stackdroid) |
2022-06-22 01:32:52 +0200 | AlexZenon | (~alzenon@178.34.160.206) |
2022-06-22 01:32:54 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) |
2022-06-22 01:33:00 +0200 | money | (~Gambino@user/polo) |
2022-06-22 01:34:34 +0200 | Alex_test | (~al_test@178.34.160.206) |
2022-06-22 01:37:00 +0200 | mixfix41 | (~sdenynine@user/mixfix41) |
2022-06-22 01:37:31 +0200 | money | (~Gambino@user/polo) () |
2022-06-22 01:38:53 +0200 | pavonia | (~user@user/siracusa) |
2022-06-22 01:40:21 +0200 | tv | (~tv@user/tv) (Ping timeout: 256 seconds) |
2022-06-22 01:46:16 +0200 | <moonsheep> | Oh forgot to report back here: it worked beautifully! |
2022-06-22 01:46:18 +0200 | <moonsheep> | Thank you very much |
2022-06-22 01:46:32 +0200 | <moonsheep> | In case anyone is interested, this is my full cmake command: |
2022-06-22 01:46:33 +0200 | <moonsheep> | cmake .. -DCMAKE_BUILD_TYPE=Release -GNinja -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_ENABLE_RTTI=ON -DBUILD_SHARED_LIBS=ON |
2022-06-22 01:46:43 +0200 | <moonsheep> | llvm 9.0.1 |
2022-06-22 01:47:00 +0200 | moonsheep | (~user@user/moonsheep) (ERC 5.4 (IRC client for GNU Emacs 28.1)) |
2022-06-22 01:47:56 +0200 | cosimone | (~user@93-44-186-171.ip98.fastwebnet.it) (Read error: Connection reset by peer) |
2022-06-22 01:49:45 +0200 | k8yun | (~k8yun@user/k8yun) |
2022-06-22 01:52:21 +0200 | rkk | (~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f) |
2022-06-22 01:52:33 +0200 | tv | (~tv@user/tv) |
2022-06-22 01:55:08 +0200 | k8yun | (~k8yun@user/k8yun) (Quit: Leaving) |
2022-06-22 01:55:40 +0200 | rkk | (~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f) (Remote host closed the connection) |
2022-06-22 02:00:09 +0200 | juri__ | (~juri@79.140.115.124) |
2022-06-22 02:01:29 +0200 | [itchyjunk] | (~itchyjunk@user/itchyjunk/x-7353470) (Ping timeout: 248 seconds) |
2022-06-22 02:02:29 +0200 | jmdaemon | (~jmdaemon@user/jmdaemon) |
2022-06-22 02:02:58 +0200 | quarkyalice | (~quarkyali@user/quarkyalice) (Quit: quarkyalice) |
2022-06-22 02:03:05 +0200 | juri_ | (~juri@79.140.115.72) (Ping timeout: 248 seconds) |
2022-06-22 02:05:40 +0200 | [itchyjunk] | (~itchyjunk@user/itchyjunk/x-7353470) |
2022-06-22 02:06:07 +0200 | quarkyalice | (~quarkyali@user/quarkyalice) |
2022-06-22 02:13:58 +0200 | vysn | (~vysn@user/vysn) |
2022-06-22 02:21:04 +0200 | <Axman6> | dsal: isn't that just sequence? |
2022-06-22 02:21:28 +0200 | <Axman6> | > sequence ["ABC","TUV","XYZ"] |
2022-06-22 02:21:30 +0200 | <lambdabot> | ["ATX","ATY","ATZ","AUX","AUY","AUZ","AVX","AVY","AVZ","BTX","BTY","BTZ","BU... |
2022-06-22 02:27:27 +0200 | td_ | (~td@muedsl-82-207-238-103.citykom.de) |
2022-06-22 02:29:28 +0200 | <dsal> | Axman6: It is in that case, I think, but this was groups of three of two things. |
2022-06-22 02:29:42 +0200 | <dsal> | > sequence [[True, False]] |
2022-06-22 02:29:44 +0200 | <lambdabot> | [[True],[False]] |
2022-06-22 02:30:13 +0200 | <dsal> | > sequence [[True], [False]] |
2022-06-22 02:30:15 +0200 | <lambdabot> | [[True,False]] |
2022-06-22 02:30:43 +0200 | <dsal> | I can't make that exciting. |
2022-06-22 02:30:50 +0200 | pleo | (~pleo@user/pleo) (Ping timeout: 240 seconds) |
2022-06-22 02:35:27 +0200 | jmcarthur | (~jmcarthur@c-73-29-224-10.hsd1.nj.comcast.net) |
2022-06-22 02:36:06 +0200 | jmcarthur | (~jmcarthur@c-73-29-224-10.hsd1.nj.comcast.net) (Client Quit) |
2022-06-22 02:38:38 +0200 | pretty_dumm_guy | (trottel@gateway/vpn/protonvpn/prettydummguy/x-88029655) (Ping timeout: 240 seconds) |
2022-06-22 02:40:45 +0200 | esrh | (~user@res404s-128-61-105-50.res.gatech.edu) |
2022-06-22 02:42:12 +0200 | <jackdk> | it alternates, that's exciting! |
2022-06-22 02:46:11 +0200 | <dsal> | > sequence [[True, False]] |
2022-06-22 02:46:13 +0200 | <lambdabot> | [[True],[False]] |
2022-06-22 02:46:17 +0200 | xff0x | (~xff0x@b133147.ppp.asahi-net.or.jp) (Ping timeout: 248 seconds) |
2022-06-22 02:46:28 +0200 | <hpc> | it'd be more fun if it was like that for all inputs |
2022-06-22 02:46:30 +0200 | <dsal> | My attention span is short enough that I just typed up a thing I already typed up to try it. |
2022-06-22 02:46:53 +0200 | <dsal> | > replicateM 3 ["ABC", "TUV", "XYZ"] |
2022-06-22 02:46:55 +0200 | <Axman6> | It could've changed in that time |
2022-06-22 02:46:55 +0200 | <lambdabot> | [["ABC","ABC","ABC"],["ABC","ABC","TUV"],["ABC","ABC","XYZ"],["ABC","TUV","A... |
2022-06-22 02:47:16 +0200 | <dsal> | > replicateM 3 "ABC" -- I guess at this point, it's just permutations of 3 |
2022-06-22 02:47:18 +0200 | <lambdabot> | ["AAA","AAB","AAC","ABA","ABB","ABC","ACA","ACB","ACC","BAA","BAB","BAC","BB... |
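A minimal sketch of the two list-monad idioms compared above, using only `base`. The names `products` and `cartesianPower` are hypothetical helpers introduced for illustration. In the list monad, `sequence` picks one element from each inner list in every possible way, and `replicateM n xs` behaves like `sequence` applied to `n` copies of `xs`. Strictly speaking, `replicateM 3 "ABC"` enumerates all 27 length-3 words with repetition (a Cartesian power) rather than permutations.

```haskell
import Control.Monad (replicateM)

-- sequence in the list monad: the Cartesian product of the inner lists.
products :: [String]
products = sequence ["AB", "XY"]

-- Hypothetical helper: sequence applied to n copies of the same list.
cartesianPower :: Int -> [a] -> [[a]]
cartesianPower n = sequence . replicate n

main :: IO ()
main = do
  print products                                        -- ["AX","AY","BX","BY"]
  print (replicateM 2 "AB")                             -- ["AA","AB","BA","BB"]
  print (cartesianPower 3 "ABC" == replicateM 3 "ABC")  -- True
  print (length (replicateM 3 "ABC"))                   -- 27
```

This also explains the earlier `sequence [[True], [False]]` result: each inner list offers only one choice, so there is exactly one combination.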
2022-06-22 02:50:12 +0200 | quarkyalice | (~quarkyali@user/quarkyalice) (Remote host closed the connection) |
2022-06-22 02:51:21 +0200 | rkk | (~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f) |
2022-06-22 02:51:43 +0200 | quarkyalice | (~quarkyali@user/quarkyalice) |
2022-06-22 02:55:03 +0200 | quarkyalice | (~quarkyali@user/quarkyalice) (Client Quit) |
2022-06-22 02:57:06 +0200 | rkk | (~rkk@2601:547:b01:53f3:df8:eb6f:ddf9:a41f) (Quit: Leaving) |
2022-06-22 02:59:23 +0200 | cheater1__ | (~Username@user/cheater) |
2022-06-22 02:59:30 +0200 | cheater | (~Username@user/cheater) (Ping timeout: 264 seconds) |
2022-06-22 02:59:37 +0200 | cheater1__ | cheater |
2022-06-22 03:01:58 +0200 | machinedgod | (~machinedg@66.244.246.252) (Ping timeout: 240 seconds) |
2022-06-22 03:03:02 +0200 | cheater | (~Username@user/cheater) (Client Quit) |
2022-06-22 03:03:47 +0200 | cheater | (~Username@user/cheater) |
2022-06-22 03:04:02 +0200 | Guest27 | (~Guest27@2601:281:d47f:1590::2df) |
2022-06-22 03:04:32 +0200 | notzmv | (~zmv@user/notzmv) (Ping timeout: 255 seconds) |
2022-06-22 03:09:12 +0200 | moet | (~moet@mobile-166-177-248-235.mycingular.net) |
2022-06-22 03:09:32 +0200 | moet | (~moet@mobile-166-177-248-235.mycingular.net) (Client Quit) |
2022-06-22 03:10:32 +0200 | <Guest27> | Is Cabal supposed to cache ghc options by default? If I run `cabal build --ghc-options -ddump-splices`, future builds will always dump the splices even without any ghc options passed until running `cabal clean` |
2022-06-22 03:18:35 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 03:20:22 +0200 | Guest27 | (~Guest27@2601:281:d47f:1590::2df) (Quit: Client closed) |
2022-06-22 03:23:30 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 268 seconds) |
2022-06-22 03:24:17 +0200 | stefan-_ | (~cri@42dots.de) (Ping timeout: 246 seconds) |
2022-06-22 03:28:03 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 03:28:42 +0200 | stackdroid18 | (14094@user/stackdroid) (Quit: Lost terminal) |
2022-06-22 03:29:06 +0200 | stefan-_ | (~cri@42dots.de) |
2022-06-22 03:30:39 +0200 | xff0x | (~xff0x@125x103x176x34.ap125.ftth.ucom.ne.jp) |
2022-06-22 03:33:13 +0200 | alp__ | (~alp@user/alp) (Ping timeout: 248 seconds) |
2022-06-22 03:34:15 +0200 | esrh | (~user@res404s-128-61-105-50.res.gatech.edu) (Remote host closed the connection) |
2022-06-22 03:50:01 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 268 seconds) |
2022-06-22 03:56:12 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds) |
2022-06-22 04:04:33 +0200 | nibelungen | (~asturias@2001:19f0:7001:638:5400:3ff:fef3:8725) |
2022-06-22 04:07:04 +0200 | kimjetwav | (~user@2607:fea8:2340:da00:1282:4dfa:aaca:27db) |
2022-06-22 04:09:51 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) |
2022-06-22 04:20:36 +0200 | FinnElija | (~finn_elij@user/finn-elija/x-0085643) (Killed (NickServ (Forcing logout FinnElija -> finn_elija))) |
2022-06-22 04:20:36 +0200 | finn_elija | (~finn_elij@user/finn-elija/x-0085643) |
2022-06-22 04:20:36 +0200 | finn_elija | FinnElija |
2022-06-22 04:26:31 +0200 | Unicorn_Princess | (~Unicorn_P@93-103-228-248.dynamic.t-2.net) (Remote host closed the connection) |
2022-06-22 04:27:24 +0200 | frost | (~frost@user/frost) |
2022-06-22 04:36:38 +0200 | liz | (~liz@cpc84585-newc17-2-0-cust60.16-2.cable.virginm.net) (Ping timeout: 240 seconds) |
2022-06-22 04:37:48 +0200 | jao | (~jao@cpc103048-sgyl39-2-0-cust502.18-2.cable.virginm.net) (Ping timeout: 276 seconds) |
2022-06-22 04:39:45 +0200 | mikoto-chan | (~mikoto-ch@esm-84-240-99-143.netplaza.fi) (Ping timeout: 276 seconds) |
2022-06-22 04:48:00 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Remote host closed the connection) |
2022-06-22 04:56:26 +0200 | esrh | (~user@res404s-128-61-105-50.res.gatech.edu) |
2022-06-22 05:02:30 +0200 | td_ | (~td@muedsl-82-207-238-103.citykom.de) (Ping timeout: 264 seconds) |
2022-06-22 05:04:09 +0200 | td_ | (~td@muedsl-82-207-238-203.citykom.de) |
2022-06-22 05:04:18 +0200 | vysn | (~vysn@user/vysn) (Ping timeout: 264 seconds) |
2022-06-22 05:14:43 +0200 | notzmv | (~zmv@user/notzmv) |
2022-06-22 05:16:35 +0200 | waleee | (~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340) (Ping timeout: 244 seconds) |
2022-06-22 05:18:37 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) |
2022-06-22 05:21:07 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 05:22:45 +0200 | z0k | (~z0k@206.84.141.12) |
2022-06-22 05:22:58 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Ping timeout: 240 seconds) |
2022-06-22 05:41:39 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds) |
2022-06-22 05:42:12 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) |
2022-06-22 05:42:23 +0200 | [itchyjunk] | (~itchyjunk@user/itchyjunk/x-7353470) (Remote host closed the connection) |
2022-06-22 05:43:19 +0200 | esrh | (~user@res404s-128-61-105-50.res.gatech.edu) (Remote host closed the connection) |
2022-06-22 05:49:05 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 06:05:38 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 246 seconds) |
2022-06-22 06:09:39 +0200 | Vajb | (~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) (Read error: Connection reset by peer) |
2022-06-22 06:10:09 +0200 | Vajb | (~Vajb@2001:999:40:4c50:1b24:879c:6df3:1d06) |
2022-06-22 06:14:02 +0200 | Sgeo_ | (~Sgeo@user/sgeo) |
2022-06-22 06:14:04 +0200 | Kaipei | (~Kaiepi@156.34.47.253) |
2022-06-22 06:14:34 +0200 | Feuermagier_ | (~Feuermagi@138.199.36.237) |
2022-06-22 06:14:42 +0200 | apache2 | (apache2@anubis.0x90.dk) |
2022-06-22 06:15:14 +0200 | Katarushisu4 | (~Katarushi@cpc147334-finc20-2-0-cust27.4-2.cable.virginm.net) |
2022-06-22 06:15:23 +0200 | EsoAlgo1 | (~EsoAlgo@129.146.136.145) |
2022-06-22 06:15:32 +0200 | elkcl_ | (~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru) |
2022-06-22 06:15:36 +0200 | ulvarref` | (~user@188.124.56.153) |
2022-06-22 06:15:47 +0200 | Natch | (~natch@c-9e07225c.038-60-73746f7.bbcust.telenor.se) (Ping timeout: 246 seconds) |
2022-06-22 06:15:47 +0200 | lambdabot | (~lambdabot@haskell/bot/lambdabot) (Ping timeout: 246 seconds) |
2022-06-22 06:16:29 +0200 | AlexZenon | (~alzenon@178.34.160.206) (Ping timeout: 246 seconds) |
2022-06-22 06:16:29 +0200 | elkcl | (~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru) (Ping timeout: 246 seconds) |
2022-06-22 06:16:29 +0200 | Henkru | (~henkru@kapsi.fi) (Ping timeout: 246 seconds) |
2022-06-22 06:16:29 +0200 | elkcl_ | elkcl |
2022-06-22 06:16:50 +0200 | tv | (~tv@user/tv) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | gentauro | (~gentauro@user/gentauro) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | drewr | (~drew@user/drewr) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | Katarushisu | (~Katarushi@cpc147334-finc20-2-0-cust27.4-2.cable.virginm.net) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | turlando | (~turlando@user/turlando) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | EsoAlgo | (~EsoAlgo@129.146.136.145) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | echoreply | (~echoreply@45.32.163.16) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | Guest1698 | (~Guest1698@20.83.116.49) (Ping timeout: 246 seconds) |
2022-06-22 06:16:50 +0200 | Katarushisu4 | Katarushisu |
2022-06-22 06:16:50 +0200 | EsoAlgo1 | EsoAlgo |
2022-06-22 06:17:11 +0200 | Kaiepi | (~Kaiepi@156.34.47.253) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | Sgeo | (~Sgeo@user/sgeo) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | ulvarrefr | (~user@188.124.56.153) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | apache | (apache2@anubis.0x90.dk) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | Dykam | (Dykam@dykam.nl) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | Feuermagier | (~Feuermagi@user/feuermagier) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | ezzieyguywuf | (~Unknown@user/ezzieyguywuf) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | hughjfchen | (~hughjfche@vmi556545.contaboserver.net) (Ping timeout: 246 seconds) |
2022-06-22 06:17:11 +0200 | JimL | (~quassel@89-162-2-132.fiber.signal.no) (Ping timeout: 246 seconds) |
2022-06-22 06:17:25 +0200 | turlando | (~turlando@93.51.40.51) |
2022-06-22 06:17:25 +0200 | turlando | (~turlando@93.51.40.51) (Changing host) |
2022-06-22 06:17:25 +0200 | turlando | (~turlando@user/turlando) |
2022-06-22 06:17:43 +0200 | lambdabot | (~lambdabot@silicon.int-e.eu) |
2022-06-22 06:17:43 +0200 | lambdabot | (~lambdabot@silicon.int-e.eu) (Changing host) |
2022-06-22 06:17:43 +0200 | lambdabot | (~lambdabot@haskell/bot/lambdabot) |
2022-06-22 06:17:48 +0200 | Dykam | (Dykam@dykam.nl) |
2022-06-22 06:17:49 +0200 | JimL | (~quassel@89-162-2-132.fiber.signal.no) |
2022-06-22 06:18:22 +0200 | Henkru | (henkru@kapsi.fi) |
2022-06-22 06:18:54 +0200 | gentauro | (~gentauro@user/gentauro) |
2022-06-22 06:19:06 +0200 | hughjfchen | (~hughjfche@vmi556545.contaboserver.net) |
2022-06-22 06:19:07 +0200 | ezzieyguywuf | (~Unknown@user/ezzieyguywuf) |
2022-06-22 06:20:37 +0200 | odnes | (~odnes@5-203-249-68.pat.nym.cosmote.net) |
2022-06-22 06:20:45 +0200 | Natch | (~natch@c-9e07225c.038-60-73746f7.bbcust.telenor.se) |
2022-06-22 06:20:53 +0200 | AlexZenon | (~alzenon@178.34.160.206) |
2022-06-22 06:21:55 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) (Quit: No Ping reply in 180 seconds.) |
2022-06-22 06:23:32 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) |
2022-06-22 06:27:05 +0200 | _73 | (~user@pool-108-49-252-36.bstnma.fios.verizon.net) |
2022-06-22 06:30:19 +0200 | echoreply | (~echoreply@2001:19f0:9002:1f3b:5400:ff:fe6f:8b8d) |
2022-06-22 06:30:30 +0200 | Guest1698 | (~Guest1698@20.83.116.49) |
2022-06-22 06:31:00 +0200 | drewr | (~drew@user/drewr) |
2022-06-22 06:31:10 +0200 | tv | (~tv@user/tv) |
2022-06-22 06:31:18 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 264 seconds) |
2022-06-22 06:45:54 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) |
2022-06-22 06:48:44 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 06:50:42 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Ping timeout: 268 seconds) |
2022-06-22 06:58:10 +0200 | odnes | (~odnes@5-203-249-68.pat.nym.cosmote.net) (Remote host closed the connection) |
2022-06-22 07:18:49 +0200 | mvk | (~mvk@2607:fea8:5ce3:8500::4588) (Ping timeout: 248 seconds) |
2022-06-22 07:18:58 +0200 | Kaipei | (~Kaiepi@156.34.47.253) (Ping timeout: 240 seconds) |
2022-06-22 07:31:05 +0200 | Teacup | (~teacup@user/teacup) (Quit: No Ping reply in 180 seconds.) |
2022-06-22 07:32:42 +0200 | Teacup | (~teacup@user/teacup) |
2022-06-22 07:41:13 +0200 | michalz | (~michalz@185.246.204.107) |
2022-06-22 07:42:40 +0200 | mjs22 | (~mjs22@76.115.19.239) |
2022-06-22 07:48:14 +0200 | causal | (~user@50.35.83.177) |
2022-06-22 07:48:34 +0200 | takuan | (~takuan@178-116-218-225.access.telenet.be) |
2022-06-22 07:48:42 +0200 | mbuf | (~Shakthi@122.164.15.152) |
2022-06-22 07:50:18 +0200 | _ht | (~quassel@231-169-21-31.ftth.glasoperator.nl) |
2022-06-22 07:57:08 +0200 | jpds1 | (~jpds@gateway/tor-sasl/jpds) |
2022-06-22 08:19:34 +0200 | acidjnk_new | (~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) |
2022-06-22 08:23:44 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) (Remote host closed the connection) |
2022-06-22 08:24:26 +0200 | mixfix41 | (~sdenynine@user/mixfix41) (Ping timeout: 268 seconds) |
2022-06-22 08:24:27 +0200 | _ht | (~quassel@231-169-21-31.ftth.glasoperator.nl) (Remote host closed the connection) |
2022-06-22 08:25:25 +0200 | HotblackDesiato | (~HotblackD@gateway/tor-sasl/hotblackdesiato) (Remote host closed the connection) |
2022-06-22 08:25:28 +0200 | jgeerds | (~jgeerds@55d45f48.access.ecotel.net) |
2022-06-22 08:25:46 +0200 | HotblackDesiato | (~HotblackD@gateway/tor-sasl/hotblackdesiato) |
2022-06-22 08:26:32 +0200 | vysn | (~vysn@user/vysn) |
2022-06-22 08:31:57 +0200 | Sgeo_ | (~Sgeo@user/sgeo) (Read error: Connection reset by peer) |
2022-06-22 08:37:16 +0200 | dsrt^ | (~dsrt@50.237.44.186) |
2022-06-22 08:42:36 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) (Remote host closed the connection) |
2022-06-22 08:45:01 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) |
2022-06-22 08:50:39 +0200 | leeb | (~leeb@KD106155002239.au-net.ne.jp) |
2022-06-22 08:57:28 +0200 | kimjetwav | (~user@2607:fea8:2340:da00:1282:4dfa:aaca:27db) (Remote host closed the connection) |
2022-06-22 08:57:53 +0200 | kimjetwav | (~user@2607:fea8:2340:da00:b4b3:9de1:4864:1487) |
2022-06-22 08:59:02 +0200 | BusConscious | (~martin@ip5f5bdedc.dynamic.kabel-deutschland.de) |
2022-06-22 09:00:15 +0200 | kimjetwav | (~user@2607:fea8:2340:da00:b4b3:9de1:4864:1487) (Remote host closed the connection) |
2022-06-22 09:00:40 +0200 | kimjetwav | (~user@2607:fea8:2340:da00:487c:b90f:99a5:bda3) |
2022-06-22 09:01:36 +0200 | <BusConscious> | hello everyone |
2022-06-22 09:01:40 +0200 | <BusConscious> | Could not find module ‘Data.ByteString.UTF8’ |
2022-06-22 09:02:16 +0200 | <BusConscious> | What's going on there? I do have bytestring installed both globally and locally as dependency in my cabal |
2022-06-22 09:02:16 +0200 | jonathanx | (~jonathan@dyn-5-sc.cdg.chalmers.se) |
2022-06-22 09:02:30 +0200 | <Axman6> | do you have a version of bytestring which has that module installed? |
2022-06-22 09:02:30 +0200 | <BusConscious> | and I can import Data.ByteString |
2022-06-22 09:02:45 +0200 | Infinite | (~Infinite@2405:204:5381:d6e2:eefe:bfdb:b3b1:f5f4) |
2022-06-22 09:02:50 +0200 | <Axman6> | https://hackage.haskell.org/package/bytestring doesn't export that module |
2022-06-22 09:03:26 +0200 | <tomsmeding> | BusConscious: utf8-string exports that |
2022-06-22 09:03:36 +0200 | <tomsmeding> | what docs told you it was in 'bytestring'? |
2022-06-22 09:04:27 +0200 | <BusConscious> | I want to convert Strings to ByteStrings back and forth |
2022-06-22 09:04:38 +0200 | <Axman6> | I'm pretty sure it's never been part of bytestring |
2022-06-22 09:04:45 +0200 | <BusConscious> | (I don't want to do that, but I have to) |
2022-06-22 09:04:48 +0200 | <Axman6> | Have you looked at the text package? |
2022-06-22 09:04:52 +0200 | <tomsmeding> | I've been using utf8-string for that, seems to work well enough |
2022-06-22 09:04:58 +0200 | <tomsmeding> | that exports Data.ByteString.UTF8 |
2022-06-22 09:06:01 +0200 | lagash | (lagash@lagash.shelltalk.net) (Ping timeout: 248 seconds) |
2022-06-22 09:08:10 +0200 | dschrempf | (~dominik@070-207.dynamic.dsl.fonira.net) |
2022-06-22 09:08:13 +0200 | <BusConscious> | ok that seems to work |
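A small sketch of the String/ByteString round trip suggested above, assuming both `bytestring` and `utf8-string` are listed under `build-depends`. `toBytes` and `fromBytes` are hypothetical wrapper names; the underlying `fromString` and `toString` come from `Data.ByteString.UTF8` in utf8-string.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.UTF8 as BSU  -- provided by utf8-string, not bytestring

-- Encode a String as UTF-8 bytes.
toBytes :: String -> BS.ByteString
toBytes = BSU.fromString

-- Decode UTF-8 bytes back into a String.
fromBytes :: BS.ByteString -> String
fromBytes = BSU.toString

main :: IO ()
main = do
  let bytes = toBytes "caf\233"               -- "café"; \233 is U+00E9
  print (BS.unpack bytes)                     -- [99,97,102,195,169]
  print (fromBytes bytes == "caf\233")        -- True
```

Note that decoding is lossy for input that is not valid UTF-8, which matters if the bytes are arbitrary POSIX file names.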
2022-06-22 09:08:30 +0200 | kimjetwav | (~user@2607:fea8:2340:da00:487c:b90f:99a5:bda3) (Ping timeout: 264 seconds) |
2022-06-22 09:08:35 +0200 | <Axman6> | BusConscious: can you tell us more about what you actually want to do? because working with text is usually something we'd do using the text package |
2022-06-22 09:08:36 +0200 | frost | (~frost@user/frost) (Quit: Client closed) |
2022-06-22 09:12:58 +0200 | bitdex | (~bitdex@gateway/tor-sasl/bitdex) (Remote host closed the connection) |
2022-06-22 09:13:57 +0200 | bitdex | (~bitdex@gateway/tor-sasl/bitdex) |
2022-06-22 09:14:20 +0200 | <BusConscious> | Axman6: So I'm trying to write a unix shell and people here have been telling me to not use String to represent filepaths and my string, because there is no requirement in POSIX, that these things should be utf-8 or whatever, which is an argument I can see for sure |
2022-06-22 09:14:30 +0200 | gmg | (~user@user/gehmehgeh) |
2022-06-22 09:15:18 +0200 | <tomsmeding> | but you _are_ converting some things to String at some point, apparently |
2022-06-22 09:16:05 +0200 | <tomsmeding> | I guess what Axman6 is getting at is that you can probably replace all your uses of String with Text |
2022-06-22 09:16:05 +0200 | <BusConscious> | yes, because it's so much easier to work with and a lot of functions in the library only accept String |
2022-06-22 09:16:15 +0200 | <BusConscious> | https://hackage.haskell.org/package/unix-2.7.2.2/docs/System-Posix-IO.html |
2022-06-22 09:16:32 +0200 | <tomsmeding> | BusConscious: that sounds like a weird argument. We want to represent non-utf8 things, but then we work with it as strings because that's easier |
2022-06-22 09:16:56 +0200 | <BusConscious> | Even in the POSIX API FilePath is accepted which is a type synonym of String |
2022-06-22 09:17:24 +0200 | <tomsmeding> | right |
2022-06-22 09:17:50 +0200 | elkcl_ | (~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru) |
2022-06-22 09:17:58 +0200 | elkcl | (~elkcl@broadband-37-110-156-162.ip.moscow.rt.ru) (Ping timeout: 240 seconds) |
2022-06-22 09:17:58 +0200 | elkcl_ | elkcl |
2022-06-22 09:18:06 +0200 | <tomsmeding> | bytestring does have IO functions, but not to FDs https://hackage.haskell.org/package/bytestring-0.11.3.1/docs/Data-ByteString.html#v:hPut |
2022-06-22 09:18:41 +0200 | <tomsmeding> | like, if you're doing the IO in String form, why even use ByteString internally |
2022-06-22 09:20:20 +0200 | <BusConscious> | ok so I should stick to either String or Text? On the other hand I might have to use FFI again and converting to and from a CString may be easier with a ByteString.. |
2022-06-22 09:20:31 +0200 | <BusConscious> | or I use text |
2022-06-22 09:20:43 +0200 | <tomsmeding> | if you want to avoid assuming UTF8, you should stick to ByteString :p |
2022-06-22 09:20:49 +0200 | benin0 | (~benin@183.82.26.120) |
2022-06-22 09:20:56 +0200 | <tomsmeding> | https://hackage.haskell.org/package/base-4.14.0.0/docs/GHC-IO-FD.html has FD I/O functions, though with Ptr |
2022-06-22 09:21:11 +0200 | <tomsmeding> | if you are okay with assuming UTF8, use Text |
2022-06-22 09:21:23 +0200 | tomsmeding | is guilty of over-using String as well, but am lazy |
2022-06-22 09:21:45 +0200 | coot | (~coot@213.134.190.95) |
2022-06-22 09:22:17 +0200 | <tomsmeding> | (that String argument to the read and write functions is simply a function name used in error messages) |
2022-06-22 09:22:34 +0200 | acidjnk | (~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) |
2022-06-22 09:23:19 +0200 | Everything | (~Everythin@37.115.210.35) |
2022-06-22 09:24:17 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) |
2022-06-22 09:24:38 +0200 | mc47 | (~mc47@xmonad/TheMC47) |
2022-06-22 09:25:11 +0200 | lortabac | (~lortabac@2a01:e0a:541:b8f0:2cd:7ecf:235f:1481) |
2022-06-22 09:25:18 +0200 | acidjnk_new | (~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) (Ping timeout: 240 seconds) |
2022-06-22 09:26:08 +0200 | ubert | (~Thunderbi@p200300ecdf0da56677798f1bce3bed29.dip0.t-ipconnect.de) |
2022-06-22 09:27:16 +0200 | <BusConscious> | I think I will stick with String for now. It may not be ideal because it assumes UTF8, but having two competing string types is such a pain in the ass. I won't get any joy out of fiddling all these types together. |
2022-06-22 09:27:41 +0200 | <tomsmeding> | if the goal is having fun, then do whatever you want :p |
2022-06-22 09:27:51 +0200 | mixfix41 | (~sdenynine@user/mixfix41) |
2022-06-22 09:28:14 +0200 | <tomsmeding> | if you were writing production software, I would advise heeding the advice here more |
2022-06-22 09:28:17 +0200 | ccntrq | (~Thunderbi@dynamic-077-003-064-244.77.3.pool.telefonica.de) |
2022-06-22 09:28:21 +0200 | <tomsmeding> | but don't let the real world spoil the fun |
2022-06-22 09:28:57 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:9:59a4:5055:fd8c) (Ping timeout: 248 seconds) |
2022-06-22 09:30:05 +0200 | <BusConscious> | One last Q: What happens if I write something like "\xff" in haskell? How is that represented byte-wise? |
2022-06-22 09:30:22 +0200 | <Axman6> | BusConscious: you've actually come across one of the more complicated situations where it's not immediately clear what you should use - for IO of data, it sounds like ByteString is the way to go, just (for now) assume you don't need to care about encoding; if you're piping stdout from one process into another's stdin, just send whatever bytes you get. as for file paths, that is more complex, because Haskell's use of String is not a good choice |
2022-06-22 09:31:26 +0200 | <tomsmeding> | BusConscious: https://tomsmeding.com/ss/get/tomsmeding/SX1k9z |
2022-06-22 09:31:41 +0200 | ccntrq1 | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 09:31:49 +0200 | rendar | (~rendar@user/rendar) (Ping timeout: 244 seconds) |
2022-06-22 09:32:03 +0200 | <Axman6> | BusConscious: describing them as competing string types isn't really fair, they all have their uses and their own tradeoffs |
2022-06-22 09:32:57 +0200 | ccntrq | (~Thunderbi@dynamic-077-003-064-244.77.3.pool.telefonica.de) (Ping timeout: 256 seconds) |
2022-06-22 09:32:57 +0200 | ccntrq1 | ccntrq |
2022-06-22 09:33:29 +0200 | <Axman6> | "\xff" depends on what type that string looking thing is. if it's a haskell String, then you'll just have ['\xff']. if it's a text (2.0) it'll be the two byte encoding of the codepoint for 255 |
2022-06-22 09:34:56 +0200 | <Axman6> | if it's a text < 2.0 Text, then it'll be the UTF-16 string with the codepoint 255 in it |
2022-06-22 09:35:29 +0200 | <BusConscious> | So what is '\xff' then is it a two byte encoding of the codepoint 255 as well? |
2022-06-22 09:35:45 +0200 | <tomsmeding> | '\xff' of type Char? |
2022-06-22 09:35:49 +0200 | <BusConscious> | yes |
2022-06-22 09:35:51 +0200 | <Axman6> | '\xff' is a Char, which represents a unicode codepoint |
2022-06-22 09:36:03 +0200 | <tomsmeding> | Char is just an Int internally |
2022-06-22 09:36:11 +0200 | <Axman6> | @src Char |
2022-06-22 09:36:11 +0200 | <lambdabot> | data Char = C# Char# |
2022-06-22 09:36:16 +0200 | <Axman6> | @src Char# |
2022-06-22 09:36:16 +0200 | <lambdabot> | Source not found. There are some things that I just don't know. |
2022-06-22 09:36:19 +0200 | <Axman6> | :( |
2022-06-22 09:36:48 +0200 | <tomsmeding> | https://hackage.haskell.org/package/ghc-prim-0.8.0/docs/src/GHC.Prim.html#Char%23 |
2022-06-22 09:36:54 +0200 | <Axman6> | but yeah, Char# is basically (or actually?) an Int# (or Int32#?) |
2022-06-22 09:37:00 +0200 | Infinite | (~Infinite@2405:204:5381:d6e2:eefe:bfdb:b3b1:f5f4) (Ping timeout: 252 seconds) |
2022-06-22 09:37:02 +0200 | <BusConscious> | Unicode sequences can be at most 3 or 4 bytes right, so they fit in a Int32 |
2022-06-22 09:37:02 +0200 | <tomsmeding> | seems to be a primitive type |
2022-06-22 09:37:05 +0200 | <BusConscious> | makes sense |
2022-06-22 09:37:36 +0200 | <Axman6> | Chars are not utf-8 sequences, they are codepoints, they're just a number |
2022-06-22 09:37:48 +0200 | Infinite | (~Infinite@49.39.123.213) |
2022-06-22 09:38:14 +0200 | <Axman6> | utf-8 is an encoding, which is what text now uses internally (it used to use utf-16, which was the worst of both worlds of utf-8 and utf-32) |
2022-06-22 09:38:39 +0200 | eod|fserucas | (~eod|fseru@193.65.114.89.rev.vodafone.pt) |
2022-06-22 09:38:43 +0200 | eod|fserucas_ | (~eod|fseru@193.65.114.89.rev.vodafone.pt) |
2022-06-22 09:39:16 +0200 | <Axman6> | A Char may be written, when encoded as utf-8, using 1, 2, 3 or 4 bytes, but a Char is always a 32 bit integer (probably, can't confirm from the link above but I believe that's true) |
2022-06-22 09:39:19 +0200 | <tomsmeding> | finally, found the definition https://hackage.haskell.org/package/base-4.16.0.0/docs/src/GHC.Base.html#line-189 |
2022-06-22 09:39:37 +0200 | <tomsmeding> | sizeOf on Char returns 4, so presumably |
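The distinction drawn above between a Char (a code point) and its encoded bytes can be checked directly. This is a minimal sketch, assuming the text and bytestring packages are available; the names codePoint, utf8Bytes and latin1Bytes are made up for illustration and do not come from the discussion.

```haskell
import Data.Char (ord)
import Data.Word (Word8)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- A Char is a code point: a plain number, independent of any encoding.
codePoint :: Int
codePoint = ord '\xff'

-- Encoding is a separate step. U+00FF needs two bytes in UTF-8.
utf8Bytes :: [Word8]
utf8Bytes = B.unpack (TE.encodeUtf8 (T.pack "\xff"))

-- Data.ByteString.Char8.pack keeps only the low 8 bits of each Char,
-- so here the same code point becomes the single byte 0xFF.
latin1Bytes :: [Word8]
latin1Bytes = B.unpack (BC.pack "\xff")
```

In GHCi, codePoint evaluates to 255, utf8Bytes to [195,191] and latin1Bytes to [255].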
2022-06-22 09:45:54 +0200 | jgeerds | (~jgeerds@55d45f48.access.ecotel.net) (Ping timeout: 276 seconds) |
2022-06-22 09:46:18 +0200 | jonathanx | (~jonathan@dyn-5-sc.cdg.chalmers.se) (Ping timeout: 240 seconds) |
2022-06-22 09:47:09 +0200 | Infinite | (~Infinite@49.39.123.213) (Quit: Client closed) |
2022-06-22 09:47:27 +0200 | kuribas | (~user@ip-188-118-57-242.reverse.destiny.be) |
2022-06-22 09:48:59 +0200 | machinedgod | (~machinedg@66.244.246.252) |
2022-06-22 09:51:42 +0200 | dsrt^ | (~dsrt@50.237.44.186) (Ping timeout: 264 seconds) |
2022-06-22 09:51:56 +0200 | MajorBiscuit | (~MajorBisc@wlan-145-94-167-213.wlan.tudelft.nl) |
2022-06-22 09:53:21 +0200 | mjs22 | (~mjs22@76.115.19.239) (Quit: Leaving) |
2022-06-22 09:54:42 +0200 | progress__ | (~fffuuuu_i@45.112.243.220) |
2022-06-22 09:54:56 +0200 | raym | (~raym@user/raym) (Remote host closed the connection) |
2022-06-22 09:58:10 +0200 | AlexNoo_ | AlexNoo |
2022-06-22 10:03:40 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 10:06:28 +0200 | gurkenglas | (~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de) |
2022-06-22 10:08:10 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 240 seconds) |
2022-06-22 10:10:33 +0200 | Guest92 | (~Guest92@2600:1000:b166:c4f9:ac93:f08:cf56:c856) |
2022-06-22 10:10:40 +0200 | Guest92 | (~Guest92@2600:1000:b166:c4f9:ac93:f08:cf56:c856) () |
2022-06-22 10:11:04 +0200 | tzh | (~tzh@c-24-21-73-154.hsd1.wa.comcast.net) (Quit: zzz) |
2022-06-22 10:11:55 +0200 | <merijn> | Axman6: tbh, probably more :p |
2022-06-22 10:11:58 +0200 | <merijn> | Axman6: Char is boxed |
2022-06-22 10:12:59 +0200 | <merijn> | I like how 20(?) years after it was written I still have to link people to Joel's unicode blog |
2022-06-22 10:13:59 +0200 | <Maxdamantus> | I wonder when there'll be a standard Unicode string type in Haskell. |
2022-06-22 10:14:19 +0200 | <Maxdamantus> | (rather than one that's only limited to well-formed Unicode strings) |
2022-06-22 10:16:11 +0200 | Kaipei | (~Kaiepi@156.34.47.253) |
2022-06-22 10:16:22 +0200 | jonathanx | (~jonathan@h-178-174-176-109.A357.priv.bahnhof.se) |
2022-06-22 10:18:47 +0200 | <merijn> | Maxdamantus: Please define well-formed unicode string :D |
2022-06-22 10:18:52 +0200 | <merijn> | See you in about a year |
2022-06-22 10:19:28 +0200 | <merijn> | oh, wait, I read that inverted |
2022-06-22 10:19:56 +0200 | <Maxdamantus> | merijn: these things are clearly defined in the Unicode standard. The `Text` library is limited to well-formed Unicode strings, but according to the Unicode standard, Unicode strings are not necessarily well-formed and are specifically allowed to be any sequence of code units (of a particular type). |
2022-06-22 10:20:58 +0200 | <Maxdamantus> | and the string libraries that are backed by the Unicode consortium (eg, ICU and Java `String`s) work according to the standard. |
2022-06-22 10:21:25 +0200 | <merijn> | Maxdamantus: I don't think Text is limited to well-formed unicode, is it? |
2022-06-22 10:21:51 +0200 | <Maxdamantus> | merijn: it certainly was last time I looked at it (when it was opaquely based on UTF-16). |
2022-06-22 10:22:21 +0200 | <Maxdamantus> | aiui they switched from opaque UTF-16 to opaque UTF-8, but that's mostly an implementation detail, since they don't support storing arbitrary code units. |
2022-06-22 10:22:26 +0200 | <merijn> | Maxdamantus: I mean, you can always use ByteString and text-icu for more niche use cases |
2022-06-22 10:22:31 +0200 | cfricke | (~cfricke@user/cfricke) |
2022-06-22 10:22:38 +0200 | chomwitt | (~chomwitt@2a02:587:dc0d:e600:1174:892d:39e3:5e01) |
2022-06-22 10:22:58 +0200 | <Maxdamantus> | Right, but I meant I was wondering when there'd be a reasonably ubiquitous Unicode string type. |
2022-06-22 10:23:18 +0200 | <Maxdamantus> | It's kind of crap that Unicode is only handled properly if you use `ByteString`s or ICU |
2022-06-22 10:23:18 +0200 | <merijn> | What purpose would that serve? |
2022-06-22 10:23:47 +0200 | <merijn> | I mean, you seem to have a very specific and niche definition of "handled properly" that is not helpful to 99% of code |
2022-06-22 10:23:47 +0200 | <Maxdamantus> | People could write proper APIs involving things like file names. |
2022-06-22 10:24:03 +0200 | <merijn> | Maxdamantus: No, that wouldn't be fixed by that |
2022-06-22 10:24:16 +0200 | <merijn> | Since the fundamental problem is file name APIs being different across platforms |
2022-06-22 10:24:41 +0200 | Neuromancer | (~Neuromanc@user/neuromancer) |
2022-06-22 10:24:41 +0200 | <Maxdamantus> | There's a fairly sane way of representing them as UTF-8 strings on all of the common platforms. |
2022-06-22 10:25:00 +0200 | dschrempf | (~dominik@070-207.dynamic.dsl.fonira.net) (Quit: WeeChat 3.5) |
2022-06-22 10:25:16 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 10:25:19 +0200 | <Maxdamantus> | particularly, pass through the bytes as-is on POSIX systems, and convert to WTF-8 on win32. |
2022-06-22 10:25:52 +0200 | <tomsmeding> | how is a sequence of random bytes (not including 0 and '/', okay) even a non-well-formed unicode string |
2022-06-22 10:26:01 +0200 | <tomsmeding> | in what encoding |
2022-06-22 10:26:04 +0200 | <merijn> | Maxdamantus: Windows filenames are explicitly UTF-16 on windows |
2022-06-22 10:26:14 +0200 | <merijn> | Maxdamantus: On linux they're "nothing remotely resembling unicode" |
2022-06-22 10:26:43 +0200 | <merijn> | On macOS they're "UTF-16, except that's no longer enforced by the low level filesystem APIs, only the high level ones, RIP you" |
2022-06-22 10:26:44 +0200 | <Maxdamantus> | tomsmeding: a Unicode string is a sequence of code units. In particular, a UTF-8 Unicode string is a sequence of bytes. |
2022-06-22 10:27:04 +0200 | <Maxdamantus> | I can quote the Unicode standard. |
2022-06-22 10:27:11 +0200 | <tomsmeding> | Maxdamantus: right, and not every linux filename is valid utf8. They _are_ all valid latin1, but then every byte sequence is valid latin1 |
2022-06-22 10:27:23 +0200 | <tomsmeding> | but using latin1 encoding for windows filenames makes no sense |
2022-06-22 10:27:37 +0200 | <Maxdamantus> | tomsmeding: assuming by "valid" you mean "well-formed", that's not what I'm talking about. |
2022-06-22 10:27:49 +0200 | <Maxdamantus> | https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf#G7404 |
2022-06-22 10:27:52 +0200 | <merijn> | Maxdamantus: No, valid means "it's actually unicode" |
2022-06-22 10:28:10 +0200 | <Maxdamantus> | merijn: no. |
2022-06-22 10:28:12 +0200 | <Maxdamantus> | > D80 Unicode string: A code unit sequence containing code units of a particular Unicode |
2022-06-22 10:28:13 +0200 | <lambdabot> | <hint>:1:64: error: parse error on input ‘of’ |
2022-06-22 10:28:15 +0200 | <Maxdamantus> | encoding form. |
2022-06-22 10:28:28 +0200 | <Maxdamantus> | D78 Code unit sequence: An ordered sequence of one or more code units |
2022-06-22 10:28:33 +0200 | <Maxdamantus> | When the code unit is an 8-bit unit, a code unit sequence may also be referred |
2022-06-22 10:28:33 +0200 | <Maxdamantus> | to as a byte sequence. |
2022-06-22 10:28:48 +0200 | <Maxdamantus> | "Unicode string" does *NOT* mean well-formed (or "valid") Unicode. |
2022-06-22 10:28:59 +0200 | <Maxdamantus> | The Unicode standard makes that fairly explicit in various places. |
2022-06-22 10:29:10 +0200 | <tomsmeding> | Maxdamantus: so for my understanding, removing well-formedness from the requirements not only makes incompatible sequences of code points allowed (e.g. modifiers that don't work on particular characters, or unpaired surrogates), but also something that doesn't even decode as individual utf8 code points? |
2022-06-22 10:29:13 +0200 | <Maxdamantus> | if you look up "Unicode string" in the glossary, they will even explicitly say that there. |
2022-06-22 10:29:28 +0200 | <Maxdamantus> | > Unicode String. A code unit sequence containing code units of a particular Unicode encoding form (whether well-formed or not). (See definition D80 in Section 3.9, Unicode Encoding Forms.) |
2022-06-22 10:29:29 +0200 | <lambdabot> | <hint>:1:60: error: parse error on input ‘of’ |
2022-06-22 10:29:35 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Ping timeout: 255 seconds) |
2022-06-22 10:29:38 +0200 | <merijn> | Maxdamantus: That disagrees with what you're saying |
2022-06-22 10:29:45 +0200 | <merijn> | Maxdamantus: "a code unit sequence" |
2022-06-22 10:29:59 +0200 | <Maxdamantus> | merijn: right, that's what I've been saying. |
2022-06-22 10:29:59 +0200 | <merijn> | Maxdamantus: "Code unit: The minimal bit combination that can represent a unit of encoded text" |
2022-06-22 10:30:02 +0200 | <merijn> | for processing or interchange. |
2022-06-22 10:30:23 +0200 | <merijn> | Maxdamantus: I interpret that to mean it only includes *valid* encodings of unicode codepoints |
2022-06-22 10:30:38 +0200 | <merijn> | not all byte sequences are made up of only valid unicode codepoint encodings |
2022-06-22 10:30:50 +0200 | <Maxdamantus> | merijn: so what would be an example of a Unicode string that is not well-formed? |
2022-06-22 10:31:12 +0200 | <merijn> | Maxdamantus: A well-formed one says certain unicode codepoints can only occur before/after certain types of characters |
2022-06-22 10:31:17 +0200 | <tomsmeding> | Maxdamantus: is [\x80] a non-well-formed unicode string in utf8? |
2022-06-22 10:31:23 +0200 | <merijn> | think of "accent modifiers" like ` ' ^ |
2022-06-22 10:31:27 +0200 | <Maxdamantus> | merijn: no. That's not what well-formedness is. |
2022-06-22 10:31:39 +0200 | <Maxdamantus> | merijn: well-formedness does not involve interpretation of code points. |
2022-06-22 10:32:02 +0200 | <tomsmeding> | ah, see D84 |
2022-06-22 10:32:10 +0200 | <Maxdamantus> | merijn: well-formed simply means that it represents a sequence of Unicode scalar values. |
2022-06-22 10:32:41 +0200 | <Maxdamantus> | merijn: USVs can even be undefined, and you still have a well-formed Unicode string. |
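The notion of well-formedness discussed here (the bytes decode to a sequence of Unicode scalar values, with no interpretation of what those values mean) can be sketched with the text package's strict decoder. The helper names isWellFormedUtf8 and isScalarValue are hypothetical; the sketch assumes decodeUtf8' from Data.Text.Encoding as the validity check.

```haskell
import Data.Either (isRight)
import qualified Data.ByteString as B
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: True when the bytes decode as UTF-8 without error.
-- No meaning is attached to the decoded code points.
isWellFormedUtf8 :: B.ByteString -> Bool
isWellFormedUtf8 = isRight . TE.decodeUtf8'

-- A Unicode scalar value is any code point outside the surrogate range
-- U+D800..U+DFFF.
isScalarValue :: Char -> Bool
isScalarValue c = c < '\xD800' || c > '\xDFFF'
```

For the byte sequence asked about above, isWellFormedUtf8 (B.pack [0x80]) is False, because a lone continuation byte cannot start a UTF-8 sequence.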
2022-06-22 10:33:04 +0200 | <tomsmeding> | representing a non-well-formed unicode string then basically means either 1. storing (original bytes, purported encoding), or 2. some ugly sum type with various decode failures as options |
2022-06-22 10:33:11 +0200 | <merijn> | "Ill-formed: A Unicode code unit sequence that purports to be in a Unicode encoding" |
2022-06-22 10:33:14 +0200 | <merijn> | form is called ill-formed if and only if it does not follow the specification of that Unicode encoding form. |
2022-06-22 10:33:26 +0200 | <tomsmeding> | merijn: see the second bullet point under D84 about UTF8 |
2022-06-22 10:33:30 +0200 | <Maxdamantus> | Feel free to read the Unicode standard. I'm quite familiar with chapter 3, which defines all of these things. |
2022-06-22 10:33:56 +0200 | <Maxdamantus> | merijn: right, the Unicode forms are "UTF-8", "UTF-16" and "UTF-32". |
2022-06-22 10:34:09 +0200 | <merijn> | Maxdamantus: You suggested utf-8 for linux filenames |
2022-06-22 10:34:15 +0200 | <Maxdamantus> | merijn: those exist independently of any interpretation of actual code points. That happens elsewhere in the standard. |
2022-06-22 10:34:22 +0200 | <merijn> | Maxdamantus: linux filenames are ill-formed per D84 |
2022-06-22 10:34:42 +0200 | <merijn> | anyway, meeting |
2022-06-22 10:34:49 +0200 | <tomsmeding> | merijn: hence Maxdamantus is suggesting using a string type that can represent non-well-formed strings |
2022-06-22 10:34:54 +0200 | <Maxdamantus> | merijn: right, I would suggest treating filenames as UTF-8 Unicode strings, just ones that are not necessarily well-formed (aka, not necessarily "in UTF-8") |
2022-06-22 10:35:21 +0200 | <merijn> | Which seems rather inferior to the strictly more correct ByteString representation |
2022-06-22 10:35:28 +0200 | <tomsmeding> | Maxdamantus: how much of a performance penalty do you get from using such an implementation, as compared to one that can assume its internal representation _is_ well-formed |
2022-06-22 10:35:39 +0200 | <tomsmeding> | and what gains do you get :p |
2022-06-22 10:36:11 +0200 | <tomsmeding> | I'd expect that the only time when you want to get the "text-like" data in a linux filename is when you want to show it to the user -- and at that point you can just do a lenient UTF8 decode |
2022-06-22 10:36:19 +0200 | <Maxdamantus> | tomsmeding: it's a negative penalty. You pay a penalty when using restrictively well-formed strings because you need to check that the string is well-formed when reading it. |
2022-06-22 10:36:42 +0200 | <Maxdamantus> | tomsmeding: there's some interesting commentary around that in the documentation for Rust's `bstr` package. |
2022-06-22 10:37:07 +0200 | <Maxdamantus> | where the author gives examples of things like treating a mmapped file as a string. |
2022-06-22 10:37:36 +0200 | <Maxdamantus> | you can't do that if the string library requires the bytes be well-formed, since you'd have to scan through the entire file to check it before allowing it to be used. |
2022-06-22 10:37:43 +0200 | shriekingnoise | (~shrieking@201.212.175.181) (Quit: Quit) |
2022-06-22 10:37:45 +0200 | <Maxdamantus> | https://docs.rs/bstr/latest/bstr/ |
2022-06-22 10:39:06 +0200 | Henkru | (henkru@kapsi.fi) (Ping timeout: 264 seconds) |
2022-06-22 10:39:11 +0200 | <Maxdamantus> | I don't think there would be any significant performance benefits in any cases by restricting to well-formed strings. |
2022-06-22 10:39:19 +0200 | acidjnk | (~acidjnk@dynamic-046-114-169-114.46.114.pool.telefonica.de) (Leaving) |
2022-06-22 10:39:20 +0200 | <kritzefitz> | Maxdamantus: regardless of performance, having to expect Texts to contain invalid encodings sounds like a nightmare to me. It's already a common enough pitfall to assume that any input you receive is well encoded, having to catch those failures on almost all Text operations would be far harder to handle than just having to catch failures when decoding. |
2022-06-22 10:39:26 +0200 | <Maxdamantus> | it only has negative performance impacts due to the extra checking that needs to be done at boundaries. |
2022-06-22 10:39:35 +0200 | Henkru | (henkru@kapsi.fi) |
2022-06-22 10:40:13 +0200 | <Maxdamantus> | kritzefitz: that shouldn't be necessary. |
2022-06-22 10:40:42 +0200 | <Maxdamantus> | kritzefitz: the only operation that could "fail" would be iterating through code points, and that iterator could transparently emit replacement characters by default. |
2022-06-22 10:40:52 +0200 | <tomsmeding> | though UnnormText.unpack would return an Either |
2022-06-22 10:40:58 +0200 | <Maxdamantus> | (not that iterating through code points is a particularly common operation) |
2022-06-22 10:41:05 +0200 | <tomsmeding> | right |
2022-06-22 10:42:27 +0200 | <Maxdamantus> | What's `UnnormText.unpack`? |
2022-06-22 10:42:45 +0200 | <Maxdamantus> | in general these things shouldn't need to produce errors. |
2022-06-22 10:42:47 +0200 | <tomsmeding> | the unpack :: Text -> [Char] function of this hypothetical haskell library that implements non-well-formed strings |
2022-06-22 10:43:08 +0200 | <tomsmeding> | unless you expect it to round-trip with [Char] -> Text |
2022-06-22 10:43:19 +0200 | <tomsmeding> | that's going to fail if there are encoding errors in the bytestring |
2022-06-22 10:44:11 +0200 | <tomsmeding> | so you'd have unpackStrict :: Text -> Maybe [Char], or unpackStrict' :: Text -> [Either Word8 Char], or unpackLenient :: Text -> [Char] that replaces stuff with U+FFFD |
2022-06-22 10:44:18 +0200 | <Maxdamantus> | Right, I'd probably just expect it to emit replacement characters. |
2022-06-22 10:44:46 +0200 | <Maxdamantus> | since that's the normal thing to do when encountering errors while converting between Unicode representations. |
2022-06-22 10:45:14 +0200 | <tomsmeding> | I'd also want a round-tripping version, or at least one that alerts me that round-tripping isn't going to work |
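The strict and lenient unpack variants proposed above can be approximated today on top of ByteString. This is only a sketch under that assumption: the hypothetical non-well-formed Text type does not exist, so a plain ByteString stands in for it, and roundTrips is an invented name for the "will this round-trip?" check.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- Strict variant: Nothing if the bytes are not well-formed UTF-8.
unpackStrict :: B.ByteString -> Maybe String
unpackStrict bs = either (const Nothing) (Just . T.unpack) (TE.decodeUtf8' bs)

-- Lenient variant: ill-formed input is replaced with U+FFFD.
unpackLenient :: B.ByteString -> String
unpackLenient = T.unpack . TE.decodeUtf8With lenientDecode

-- True when decoding leniently and re-encoding gives back the same bytes,
-- i.e. no replacement characters had to be introduced.
roundTrips :: B.ByteString -> Bool
roundTrips bs = TE.encodeUtf8 (TE.decodeUtf8With lenientDecode bs) == bs
```

With the input B.pack [0x61, 0xFF, 0x62], unpackStrict gives Nothing, unpackLenient gives ['a', '\xFFFD', 'b'], and roundTrips gives False.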
2022-06-22 10:45:28 +0200 | <Maxdamantus> | eg, I suspect that's what will happen if you open a web browser console and do `document.body.innerHTML = "hello \ud800 world";` |
2022-06-22 10:45:49 +0200 | <tomsmeding> | it's not like we're dealing with the whole zoo of weird unicode encodings where you have 100% chance that _something_ in your text is going to be unrepresentable in one of those encodings |
2022-06-22 10:46:06 +0200 | <tomsmeding> | Maxdamantus: yes, but there we're dealing with UI :p |
2022-06-22 10:46:18 +0200 | <tomsmeding> | but yes, mostly one would use my unpackLenient |
2022-06-22 10:46:31 +0200 | <tomsmeding> | interesting, didn't think of this as an issue before |
2022-06-22 10:46:57 +0200 | <tomsmeding> | purists would say "what even are you thinking, linux filenames are not intended to be utf8 so treat them as bytestrings" |
2022-06-22 10:47:06 +0200 | <tomsmeding> | but practice says "well mostly they're utf8 mostly" |
2022-06-22 10:47:19 +0200 | <Maxdamantus> | Hm, interestingly Firefox actually renders the UTF-16 code unit as an error, and it converts it to a replacement character when doing something like copying to the clipboard. That's quite neat. |
2022-06-22 10:47:57 +0200 | <kritzefitz> | Maxdamantus: What do you expect to gain from a Text representation that allows badly encoded underlying bytestrings? Keeping the original bytes in a ByteString and only decoding when you want to do something that explicitly requires code points seems to me like it gets you the same behavior as you described. |
2022-06-22 10:48:22 +0200 | <Maxdamantus> | After people fix Unicode handling in programming languages, that's my next desire: text applications should be able to render UTF-8 errors, and it should be possible to copy the ill-formed UTF-8 around without losing information. |
2022-06-22 10:48:34 +0200 | <tomsmeding> | kritzefitz: it would only give you convenience |
2022-06-22 10:49:05 +0200 | <tomsmeding> | Maxdamantus's example of Rust's bstr library explicitly says that most of its functionality you can obtain by piecing together existing code that e.g. does regex stuff on bytestrings |
2022-06-22 10:49:27 +0200 | Henkru | (henkru@kapsi.fi) (Ping timeout: 256 seconds) |
2022-06-22 10:49:36 +0200 | <Maxdamantus> | kritzefitz: it means you can use the correct type for things. Things like filenames should be strings, not `ByteStrings`, and they should have convenient handling of Unicode. |
2022-06-22 10:50:07 +0200 | <Maxdamantus> | kritzefitz: at the moment, API designers have to decide whether to use `ByteString` for correctness or `Text` for convenience. |
2022-06-22 10:50:25 +0200 | <Maxdamantus> | kritzefitz: if `Text` handled ill-formed Unicode strings, you'd get both with one type. |
2022-06-22 10:50:54 +0200 | <kritzefitz> | But what convenient handling of unicode do you get? When is a Text actually more convenient if you're not allowed to assume that it contains only valid code points? |
2022-06-22 10:51:07 +0200 | foul_owl | (~kerry@23.82.194.107) (Ping timeout: 260 seconds) |
2022-06-22 10:51:28 +0200 | <Maxdamantus> | kritzefitz: if `Text` is not more convenient, then why doesn't everyone just use `ByteString` for things like filenames or user input? |
2022-06-22 10:53:24 +0200 | <merijn> | Maxdamantus: Why should filenames be strings? |
2022-06-22 10:53:43 +0200 | <tomsmeding> | Maxdamantus: suggestion if you pitch this to people: avoid getting the unicode spec out. The argument does not rest on "non-well-formed" being defined by the unicode spec; the argument rests on convenience in software engineering. When you throw specs at people, they throw specs back, and the spec for linux filenames is _not_ that they are unicode |
2022-06-22 10:53:49 +0200 | <tomsmeding> | never mind reality where they mostly are |
2022-06-22 10:53:59 +0200 | <tomsmeding> | (but not always) |
2022-06-22 10:54:27 +0200 | <merijn> | Also, the ability to assume that any Text in your codebase will always remain valid unicode is pretty huge |
2022-06-22 10:54:28 +0200 | <Maxdamantus> | merijn: because there should be a ubiquitous string type for text. |
2022-06-22 10:54:29 +0200 | <tomsmeding> | the unicode spec just gives you precedent for your terminology, which is nice but not essential to the pitch |
2022-06-22 10:54:49 +0200 | <merijn> | Maxdamantus: Says who? |
2022-06-22 10:55:10 +0200 | <merijn> | If anything, I think we need *more* string types and better support for being polymorphic over them |
2022-06-22 10:55:31 +0200 | <tomsmeding> | (current IsString is awful for that) |
2022-06-22 10:55:49 +0200 | <Maxdamantus> | But what's the advantage of having the other string types? |
2022-06-22 10:55:53 +0200 | <merijn> | tomsmeding: I proposed a better interface for IsString and other polymorphic literals |
2022-06-22 10:56:03 +0200 | <merijn> | Maxdamantus: Different types are optimised for different uses |
2022-06-22 10:56:17 +0200 | <merijn> | tomsmeding: That's what led to validated-literals :p |
2022-06-22 10:56:25 +0200 | <Maxdamantus> | merijn: optimised in terms of API convenience, or optimised in terms of performance? |
2022-06-22 10:56:32 +0200 | <merijn> | Both |
2022-06-22 10:56:50 +0200 | <merijn> | Your proposal is less convenient for both for 99% of code |
2022-06-22 10:56:59 +0200 | <Maxdamantus> | I don't think you're going to get better performance by having different string types (in particular, I explained how it results in worse performance) |
2022-06-22 10:57:23 +0200 | <Maxdamantus> | and I don't think you get better API convenience either. As I said, it means that API designers have to pick from various string types. |
2022-06-22 10:57:55 +0200 | <merijn> | Maxdamantus: "i can never trust any string in my entire codebase" is a pretty fucking massive downgrade in API usability, no matter what else you propose |
2022-06-22 10:58:05 +0200 | <Maxdamantus> | which string library do I import again when using library xyz? |
2022-06-22 10:58:28 +0200 | raym | (~raym@user/raym) |
2022-06-22 10:58:33 +0200 | <merijn> | Maxdamantus: See aforementioned point of "I'd rather get better solutions for being polymorphic across string types" |
2022-06-22 10:58:37 +0200 | arthurs115 | (~arthurs11@163.5.10.155) |
2022-06-22 10:59:15 +0200 | <merijn> | because that will be useful for lots of other things too |
2022-06-22 10:59:21 +0200 | <Maxdamantus> | merijn: what do you mean by "trust any string"? Do you trust the string ""? |
2022-06-22 10:59:48 +0200 | <merijn> | Maxdamantus: "any string I make by combining well-formed Text will be well-formed Text" |
2022-06-22 10:59:50 +0200 | <Maxdamantus> | is "" more trustable than a string containing ill-formed UTF-8. |
2022-06-22 10:59:55 +0200 | <Maxdamantus> | s/.$/?/ |
2022-06-22 11:00:10 +0200 | emliunix | (~emliunixm@2001:470:69fc:105::2:12d1) (Quit: You have been kicked for being idle) |
2022-06-22 11:00:10 +0200 | <merijn> | Yes |
2022-06-22 11:00:27 +0200 | <merijn> | Because the former is much more well-behaved |
2022-06-22 11:00:29 +0200 | emliunix | (~emliunixm@2001:470:69fc:105::2:12d1) |
2022-06-22 11:00:50 +0200 | <Maxdamantus> | It wouldn't be more well-behaved if there's only one string type. |
2022-06-22 11:00:59 +0200 | <Maxdamantus> | The behaviours only occur when converting between encodings. |
2022-06-22 11:01:13 +0200 | <Maxdamantus> | part of the point is to avoid converting between encodings. |
2022-06-22 11:01:29 +0200 | emliunix | (~emliunixm@2001:470:69fc:105::2:12d1) () |
2022-06-22 11:01:31 +0200 | <Maxdamantus> | UTF-16 is dying out, so that shouldn't be a major concern. |
2022-06-22 11:02:02 +0200 | <Maxdamantus> | most string handling should be taking UTF-8 bytes from a network or filesystem and sending them back to the network or filesystem. |
2022-06-22 11:03:16 +0200 | <kritzefitz> | Maxdamantus: I think I mostly use Text to be able to handle unicode regardless of the underlying encoding and the sense of security merijn mentioned. For the cases that you mention, where you only retrieve UTF-8 from somewhere and only pass it back relatively unmodified, I really don't see why Text would be more convenient than ByteString. |
2022-06-22 11:03:24 +0200 | <Maxdamantus> | there might also be some awkwardness when converting to Haskell `String`s (aka, `[Char]`), but those issues already exist with `Text` |
2022-06-22 11:03:48 +0200 | <Maxdamantus> | eg, "\55296" is a possible `String`, but it can't be converted to `Text`. |
2022-06-22 11:04:01 +0200 | <merijn> | Maxdamantus: ??? |
2022-06-22 11:04:11 +0200 | <merijn> | ALL strings are by definition convertible to Text |
2022-06-22 11:04:21 +0200 | <Maxdamantus> | merijn: I'm pretty sure that one isn't. |
2022-06-22 11:04:24 +0200 | <merijn> | Why? |
2022-06-22 11:04:28 +0200 | <kritzefitz> | Also I don't think the assumption that everything is UTF-8 and you don't need to care about other encodings is valid for a general purpose language. There are tons of contexts that need to deal with all kinds of encodings and they're not likely to go away. |
2022-06-22 11:04:33 +0200 | <merijn> | > text "\55296" |
2022-06-22 11:04:34 +0200 | <lambdabot> | mueval-core: <stdout>: hPutChar: invalid argument (invalid character) |
2022-06-22 11:04:36 +0200 | <BusConscious> | merijn: As you say I would be more inclined to use ByteString, if I could use it like a normal string in an overloaded syntax and if things like Text.Parsec.ByteString had the same functionality as say Text.Parsec.String |
2022-06-22 11:04:44 +0200 | foul_owl | (~kerry@23.82.194.107) |
2022-06-22 11:04:55 +0200 | <Maxdamantus> | merijn: because it can't be encoded as a well-formed Unicode string. |
2022-06-22 11:04:58 +0200 | <merijn> | > generalCategory '\55296' |
2022-06-22 11:04:59 +0200 | <lambdabot> | Surrogate |
2022-06-22 11:05:20 +0200 | <merijn> | > text "\55296a" |
2022-06-22 11:05:22 +0200 | <lambdabot> | mueval-core: <stdout>: hPutChar: invalid argument (invalid character) |
2022-06-22 11:05:24 +0200 | <Maxdamantus> | merijn: none of the Unicode encoding forms allow that code point to be encoded (it is not a Unicode scalar value). |
2022-06-22 11:06:48 +0200 | chele | (~chele@user/chele) |
2022-06-22 11:07:15 +0200 | <kritzefitz> | merijn: Apparently `Data.Text.pack "\55296"` has the same result as `Data.Text.pack "\65533"`. |
2022-06-22 11:07:40 +0200 | <Maxdamantus> | "\65533" is U+FFFD, ie, the replacement character. |
2022-06-22 11:08:04 +0200 | <Maxdamantus> | so that means that `pack` is emitting replacement characters on error, which is the behaviour I said is reasonable earlier. |
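The behaviour observed above, where pack turns a surrogate code point into the replacement character, can be reproduced with a short sketch. It assumes only Data.Text from the text package; packedSurrogate and survivesPack are illustrative names.

```haskell
import qualified Data.Text as T

-- "\55296" is U+D800, a surrogate. It is a legal Char but not a Unicode
-- scalar value, so pack substitutes U+FFFD ("\65533") for it.
packedSurrogate :: String
packedSurrogate = T.unpack (T.pack "\55296")

-- True when a String comes back unchanged from a trip through Text.
survivesPack :: String -> Bool
survivesPack s = T.unpack (T.pack s) == s
```

Here packedSurrogate equals "\65533", survivesPack "hello" is True, and survivesPack "\55296" is False, so String to Text is not a lossless conversion for every String.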
2022-06-22 11:10:02 +0200 | benin02 | (~benin@183.82.26.120) |
2022-06-22 11:12:10 +0200 | benin0 | (~benin@183.82.26.120) (Ping timeout: 268 seconds) |
2022-06-22 11:12:10 +0200 | benin02 | benin0 |
2022-06-22 11:12:28 +0200 | <kritzefitz> | Maxdamantus: From what you said, it seems to me like we would need a new type MaybeInvalidText that preserves its original encoding, while mostly acting like a Text. And I'm not trying to be argumentative here, but I really don't see what I gain from using it over a ByteString and I also didn't find your previous comments on that very enlightening. Can you give some example when it would do something for you that ByteString can't? |
2022-06-22 11:12:30 +0200 | jmdaemon | (~jmdaemon@user/jmdaemon) (Quit: ZNC 1.8.2 - https://znc.in) |
2022-06-22 11:13:47 +0200 | <Maxdamantus> | kritzefitz: it's useful because it can become the de facto string type. I suspect there are various APIs in Haskell that treat filenames as either `ByteString`, `String` or `Text`. |
2022-06-22 11:14:04 +0200 | <Maxdamantus> | kritzefitz: preferably there would only be one type for representing filenames. |
2022-06-22 11:14:35 +0200 | <Maxdamantus> | kritzefitz: and preferably there should be no cost (development or performance-wise) when using such strings for different purposes. |
2022-06-22 11:14:37 +0200 | <kritzefitz> | Ah, ok. I guess then I don't follow, because I just don't agree on the premise that there needs to be one de facto string type. |
2022-06-22 11:15:51 +0200 | <Maxdamantus> | If I want to write a program that scans a directory and prints the filenames to standard out, I shouldn't have to convert from the `FileName` string type to the `PuttableString` string type. |
2022-06-22 11:16:19 +0200 | <Maxdamantus> | filenames are strings, and I can print strings to standard out. |
2022-06-22 11:16:56 +0200 | <Maxdamantus> | the "hello world" program shouldn't be converting to `ByteString` just because standard out is technically a binary file. |
2022-06-22 11:17:34 +0200 | ubert | (~Thunderbi@p200300ecdf0da56677798f1bce3bed29.dip0.t-ipconnect.de) (Remote host closed the connection) |
2022-06-22 11:17:52 +0200 | ubert | (~Thunderbi@p200300ecdf0da56600626fc30d47cd25.dip0.t-ipconnect.de) |
2022-06-22 11:17:58 +0200 | progress__ | (~fffuuuu_i@45.112.243.220) (Quit: Leaving) |
2022-06-22 11:20:10 +0200 | <kritzefitz> | I really don't see why you're so afraid of converting between things. Explicit conversion gives you the ability to actually specify things like what to do on errors. Having a one-size-fits-all typically leads to wrong behavior that you have no influence over. |
2022-06-22 11:20:58 +0200 | <kritzefitz> | And forcing people to actually mention the conversion explicitly forces them to think about what they're actually trying to do. Not having to think about the conversion will often mean being later surprised when it doesn't do what you intended. |
2022-06-22 11:21:22 +0200 | <Maxdamantus> | because conversion is not always possible. either information is lost (errors replaced by replacement characters), or errors are raised. |
2022-06-22 11:22:07 +0200 | <Maxdamantus> | if there's no fear in converting between things, there wouldn't be an API in Haskell that treats filenames as `ByteString`. |
2022-06-22 11:23:09 +0200 | <Maxdamantus> | https://hackage.haskell.org/package/unix-2.7.2.2/docs/System-Posix-ByteString.html#t:RawFilePath |
2022-06-22 11:23:45 +0200 | <Maxdamantus> | If there's no problem with conversion, they'd just have `type RawFilePath = Text` or `type RawFilePath = String`. |
2022-06-22 11:23:55 +0200 | <sm> | tuning in late, I'm sure it was already said, but filenames aren't strings |
2022-06-22 11:24:15 +0200 | <kritzefitz> | Using error replacements or raising errors is IMO a good thing. If you don't want that, you probably don't want Text. If you need a Text, you have to deal with the errors in some way anyway. |
2022-06-22 11:24:25 +0200 | <Maxdamantus> | They're not Haskell strings at least, but they are Unicode strings (aka, bytestrings in the case of UTF-8). |
2022-06-22 11:24:53 +0200 | sm | invokes maerwald |
2022-06-22 11:24:59 +0200 | <Maxdamantus> | sm: (earlier on I pointed out that standard Unicode strings are not necessarily well-formed, and that the standard describes UTF-8 as effectively equivalent to "bytestrings") |
2022-06-22 11:25:43 +0200 | <Maxdamantus> | anyway, it's also getting kind of late for me, need to do other stuff this evening. |
2022-06-22 11:26:11 +0200 | <sm> | very well, carry on! 👍🏻 |
2022-06-22 11:28:38 +0200 | <merijn> | Maxdamantus: Unix package has lots of questionable API design regardless, tbh |
2022-06-22 11:32:41 +0200 | jgeerds | (~jgeerds@55d45f48.access.ecotel.net) |
2022-06-22 11:40:22 +0200 | justromeon | (~justromeo@120.29.68.81) |
2022-06-22 11:41:21 +0200 | justromeon | (~justromeo@120.29.68.81) (Client Quit) |
2022-06-22 11:41:45 +0200 | justromeon | (~justromeo@120.29.68.81) |
2022-06-22 11:44:06 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Remote host closed the connection) |
2022-06-22 11:44:29 +0200 | justromeon | (~justromeo@120.29.68.81) (Client Quit) |
2022-06-22 11:45:04 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 11:48:33 +0200 | justromeon | (~justromeo@120.29.68.81) |
2022-06-22 11:49:50 +0200 | justromeon | (~justromeo@120.29.68.81) (Client Quit) |
2022-06-22 11:49:54 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 264 seconds) |
2022-06-22 11:55:02 +0200 | ccntrq1 | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 11:55:05 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 248 seconds) |
2022-06-22 11:55:05 +0200 | ccntrq1 | ccntrq |
2022-06-22 11:58:50 +0200 | lisbeths | (uid135845@id-135845.lymington.irccloud.com) |
2022-06-22 11:58:51 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection) |
2022-06-22 11:59:05 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 12:02:34 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 12:03:27 +0200 | arthurs115 | (~arthurs11@163.5.10.155) (Remote host closed the connection) |
2022-06-22 12:04:27 +0200 | alp__ | (~alp@user/alp) |
2022-06-22 12:05:04 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection) |
2022-06-22 12:05:49 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 12:06:55 +0200 | adanwan | (~adanwan@gateway/tor-sasl/adanwan) (Remote host closed the connection) |
2022-06-22 12:07:25 +0200 | adanwan | (~adanwan@gateway/tor-sasl/adanwan) |
2022-06-22 12:08:22 +0200 | raym | (~raym@user/raym) (Ping timeout: 244 seconds) |
2022-06-22 12:08:33 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Read error: Connection reset by peer) |
2022-06-22 12:08:45 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 12:09:33 +0200 | kristjansson | (sid126207@tinside.irccloud.com) (Ping timeout: 276 seconds) |
2022-06-22 12:10:23 +0200 | raym | (~raym@user/raym) |
2022-06-22 12:11:05 +0200 | xff0x | (~xff0x@125x103x176x34.ap125.ftth.ucom.ne.jp) (Ping timeout: 248 seconds) |
2022-06-22 12:12:12 +0200 | kristjansson | (sid126207@id-126207.tinside.irccloud.com) |
2022-06-22 12:13:53 +0200 | cfricke | (~cfricke@user/cfricke) (Ping timeout: 256 seconds) |
2022-06-22 12:15:08 +0200 | ccntrq1 | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 12:15:16 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection) |
2022-06-22 12:15:18 +0200 | ccntrq1 | ccntrq |
2022-06-22 12:15:38 +0200 | Surobaki | (~surobaki@137.44.222.80) |
2022-06-22 12:18:13 +0200 | econo | (uid147250@user/econo) (Quit: Connection closed for inactivity) |
2022-06-22 12:20:18 +0200 | justromeon | (~justromeo@120.29.68.81) |
2022-06-22 12:20:48 +0200 | justromeon | (~justromeo@120.29.68.81) (Client Quit) |
2022-06-22 12:21:07 +0200 | justromeon | (~justromeo@120.29.68.81) |
2022-06-22 12:21:18 +0200 | justromeon | (~justromeo@120.29.68.81) (Client Quit) |
2022-06-22 12:21:25 +0200 | xnorfzt | (~xnorfzt@2a02:908:d88:320:b5c0:b85f:3ec0:5838) |
2022-06-22 12:21:41 +0200 | justromeon | (~justromeo@120.29.68.81) |
2022-06-22 12:22:28 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 268 seconds) |
2022-06-22 12:24:10 +0200 | <xnorfzt> | Hi all! I'm trying to convert a number of seconds to the difference in hours, minutes and seconds as a `(Int, Int, Int)`. I can do the math by myself using divMod, but is there an ultra-readable way using the time library? I found out that I can create `DiffTime` values with `fromIntegral`, but how do I access the resulting single components |
2022-06-22 12:24:10 +0200 | <xnorfzt> | without `format`ting the time difference? |
2022-06-22 12:24:50 +0200 | <xnorfzt> | Whoops - sorry for the broken code markup. So used to markdown... |
2022-06-22 12:25:05 +0200 | justromeon | (~justromeo@120.29.68.81) (Client Quit) |
2022-06-22 12:26:09 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 12:26:56 +0200 | cfricke | (~cfricke@user/cfricke) |
2022-06-22 12:27:35 +0200 | justromeon | (~justromeo@120.29.68.81) |
2022-06-22 12:28:02 +0200 | justromeon | (~justromeo@120.29.68.81) (Client Quit) |
2022-06-22 12:28:57 +0200 | BusConscious | (~martin@ip5f5bdedc.dynamic.kabel-deutschland.de) (Remote host closed the connection) |
2022-06-22 12:29:13 +0200 | raym | (~raym@user/raym) (Ping timeout: 248 seconds) |
2022-06-22 12:30:42 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 264 seconds) |
2022-06-22 12:30:42 +0200 | ccntrq1 | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 12:30:59 +0200 | raym | (~raym@user/raym) |
2022-06-22 12:32:09 +0200 | zaquest | (~notzaques@5.130.79.72) (Remote host closed the connection) |
2022-06-22 12:33:11 +0200 | ccntrq1 | ccntrq |
2022-06-22 12:35:40 +0200 | Midjak | (~Midjak@82.66.147.146) |
2022-06-22 12:36:02 +0200 | fnurglewitz | (uid263868@id-263868.lymington.irccloud.com) |
2022-06-22 12:37:37 +0200 | chexum_ | (~quassel@gateway/tor-sasl/chexum) |
2022-06-22 12:39:50 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) (Remote host closed the connection) |
2022-06-22 12:40:46 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection) |
2022-06-22 12:41:03 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 12:42:50 +0200 | chexum_ | (~quassel@gateway/tor-sasl/chexum) (Ping timeout: 268 seconds) |
2022-06-22 12:46:47 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) |
2022-06-22 12:48:23 +0200 | azimut | (~azimut@gateway/tor-sasl/azimut) (Ping timeout: 268 seconds) |
2022-06-22 12:48:57 +0200 | merijn | (~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl) (Ping timeout: 248 seconds) |
2022-06-22 12:51:07 +0200 | zaquest | (~notzaques@5.130.79.72) |
2022-06-22 12:52:50 +0200 | azimut | (~azimut@gateway/tor-sasl/azimut) |
2022-06-22 12:58:14 +0200 | <tomsmeding> | to be honest if you want the most _readable_ option, I vote for \n -> (n `div` 3600, n `div` 60 `mod` 60, n `mod` 60) |
2022-06-22 12:58:28 +0200 | chomwitt | (~chomwitt@2a02:587:dc0d:e600:1174:892d:39e3:5e01) (Quit: Leaving) |
2022-06-22 12:58:33 +0200 | <tomsmeding> | since 'time' doesn't seem to have a dedicated function for this |
2022-06-22 12:59:09 +0200 | Henkru | (henkru@kapsi.fi) |
2022-06-22 13:00:56 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection) |
2022-06-22 13:01:19 +0200 | <int-e> | > (\case ts | (tm, s) <- ts `divMod` 60, (th, m) <- tm `divMod` 60 -> (th,m,s)) 4242 -- scnr |
2022-06-22 13:01:21 +0200 | <lambdabot> | (1,10,42) |
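The two suggestions above can be written out as a named function. This is only a sketch using `divMod` from the Prelude; the name `secondsToHms` is invented here, and it assumes a non-negative number of seconds.

```haskell
-- Split a non-negative number of seconds into (hours, minutes, seconds).
secondsToHms :: Int -> (Int, Int, Int)
secondsToHms total = (hours, minutes, seconds)
  where
    -- First peel off the seconds, leaving whole minutes.
    (totalMinutes, seconds) = total `divMod` 60
    -- Then peel off the minutes, leaving whole hours.
    (hours, minutes) = totalMinutes `divMod` 60

main :: IO ()
main = print (secondsToHms 4242) -- prints (1,10,42), matching lambdabot above
```

The hours component is not capped at 24, so this behaves like a duration rather than a time of day.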
2022-06-22 13:01:51 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 13:02:41 +0200 | xff0x | (~xff0x@b133147.ppp.asahi-net.or.jp) |
2022-06-22 13:04:00 +0200 | <tomsmeding> | int-e: why a lambdacase instead of \ts -> let (tm, s) = ts `divMod` 60 ; ... |
2022-06-22 13:04:00 +0200 | xnorfzt | thinks about time-lens :D |
2022-06-22 13:04:16 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection) |
2022-06-22 13:04:32 +0200 | <tomsmeding> | nice variable names, though |
2022-06-22 13:04:49 +0200 | <int-e> | tomsmeding: because then I wouldn't get to (ab)use pattern guards |
2022-06-22 13:05:03 +0200 | <tomsmeding> | why are pattern guards better than simple let clauses in this case :p |
2022-06-22 13:05:04 +0200 | <int-e> | I wanted to call them all t. |
2022-06-22 13:05:12 +0200 | <tomsmeding> | right |
2022-06-22 13:05:19 +0200 | <xnorfzt> | tomsmeding int-e - makes sense, it's pretty short and readable, but not what I'm looking for. <3 |
2022-06-22 13:05:56 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 13:07:46 +0200 | lyle | (~lyle@104.246.145.85) |
2022-06-22 13:08:43 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Remote host closed the connection) |
2022-06-22 13:09:18 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 13:10:51 +0200 | xnorfzt | (~xnorfzt@2a02:908:d88:320:b5c0:b85f:3ec0:5838) (Quit: xnorfzt) |
2022-06-22 13:12:30 +0200 | fryguybob | (~fryguybob@cpe-74-67-169-145.rochester.res.rr.com) (Quit: leaving) |
2022-06-22 13:12:37 +0200 | sympt | (~sympt@user/sympt) (Read error: Connection reset by peer) |
2022-06-22 13:13:45 +0200 | sympt | (~sympt@user/sympt) |
2022-06-22 13:13:57 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Ping timeout: 256 seconds) |
2022-06-22 13:15:02 +0200 | merijn | (~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl) |
2022-06-22 13:15:47 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 13:24:11 +0200 | geekosaur | (~geekosaur@xmonad/geekosaur) (Read error: Connection reset by peer) |
2022-06-22 13:24:18 +0200 | allbery_b | (~geekosaur@xmonad/geekosaur) |
2022-06-22 13:24:21 +0200 | allbery_b | geekosaur |
2022-06-22 13:27:46 +0200 | Surobaki | (~surobaki@137.44.222.80) (Quit: Leaving) |
2022-06-22 13:28:12 +0200 | odnes | (~odnes@5-203-220-108.pat.nym.cosmote.net) |
2022-06-22 13:30:23 +0200 | Surobaki | (~surobaki@137.44.222.80) |
2022-06-22 13:30:41 +0200 | coot | (~coot@213.134.190.95) (Quit: coot) |
2022-06-22 13:34:16 +0200 | coot | (~coot@213.134.190.95) |
2022-06-22 13:35:36 +0200 | dschrempf | (~dominik@070-207.dynamic.dsl.fonira.net) |
2022-06-22 13:39:01 +0200 | coot | (~coot@213.134.190.95) (Client Quit) |
2022-06-22 13:40:07 +0200 | <maerwald> | Maxdamantus: https://github.com/haskellfoundation/tech-proposals/issues/35 |
2022-06-22 13:42:26 +0200 | rendar | (~Paxman@user/rendar) |
2022-06-22 13:42:39 +0200 | haritzondo | (~hrtz@82-69-11-11.dsl.in-addr.zen.co.uk) (Changing host) |
2022-06-22 13:42:39 +0200 | haritzondo | (~hrtz@user/haritz) |
2022-06-22 13:42:48 +0200 | haritzondo | haritz |
2022-06-22 13:48:31 +0200 | geekosaur | (~geekosaur@xmonad/geekosaur) (Ping timeout: 256 seconds) |
2022-06-22 13:49:29 +0200 | geekosaur | (~geekosaur@xmonad/geekosaur) |
2022-06-22 13:50:02 +0200 | benin0 | (~benin@183.82.26.120) (Quit: The Lounge - https://thelounge.chat) |
2022-06-22 13:50:35 +0200 | <Maxdamantus> | maerwald: hm, seems a lot more complicated/tedious than just using a string type capable of handling any byte sequence, where WTF-8 would be used for handling filenames. |
2022-06-22 13:51:01 +0200 | <maerwald> | Maxdamantus: I thought about using WTF-8, but I don't like it |
2022-06-22 13:51:22 +0200 | <maerwald> | I'm not sure you can easily reconstruct underlying encoding information from WTF-8... it would be complicated |
2022-06-22 13:51:23 +0200 | <Maxdamantus> | and yeah, the filenames thing is just an obvious example. The same thing applies to simply reading text from files. |
2022-06-22 13:51:40 +0200 | <maerwald> | the idea is to stop messing with the data that syscalls return |
2022-06-22 13:52:05 +0200 | dsrt^ | (~dsrt@50.237.44.186) |
2022-06-22 13:53:37 +0200 | <Maxdamantus> | (reading text from files should be a simpler problem because files are at least still [Word8] on Windows, rather than [Word16]) |
2022-06-22 13:54:27 +0200 | <Maxdamantus> | imo Windows' use of UTF-16 shouldn't be a reason to complicate the API for other platforms. |
2022-06-22 13:55:03 +0200 | <Maxdamantus> | WTF-8 is slightly ugly, but it's only used to address an ugly API that might be obsolete soon anyway. |
2022-06-22 13:55:08 +0200 | <maerwald> | Maxdamantus: how is the API more complicated? These details are hidden behind a newtype |
2022-06-22 13:55:53 +0200 | <Maxdamantus> | maerwald: how do you print a filename to standard out? |
2022-06-22 13:55:56 +0200 | raehik | (~raehik@cpc95906-rdng25-2-0-cust156.15-3.cable.virginm.net) |
2022-06-22 13:56:12 +0200 | <maerwald> | putStr filepath |
2022-06-22 13:56:12 +0200 | <Maxdamantus> | presumably involves a conversion. |
2022-06-22 13:57:01 +0200 | <hpc> | you have to pick an encoding anyway on linux, since it's not specified |
2022-06-22 13:57:12 +0200 | jgeerds | (~jgeerds@55d45f48.access.ecotel.net) (Ping timeout: 248 seconds) |
2022-06-22 13:57:49 +0200 | <Maxdamantus> | maerwald: so filepath is still a `String`? |
2022-06-22 13:57:53 +0200 | <maerwald> | Maxdamantus: no |
2022-06-22 13:58:27 +0200 | <Maxdamantus> | so `putStr` is made polymorphic over things that are like strings? |
2022-06-22 13:58:50 +0200 | <merijn> | I wonder if this discussion will answer my questions vis-a-vis unstoppable forces and immovable objects :) |
2022-06-22 13:58:59 +0200 | <hpc> | it's not making the api more complicated, it's making it more accurate |
2022-06-22 13:59:02 +0200 | <maerwald> | Maxdamantus: sorry, I meant `print` |
2022-06-22 13:59:14 +0200 | <hpc> | the current api is simple in the same way javascript is simple |
2022-06-22 14:00:06 +0200 | <hpc> | right now, (putStr filepath) is complicated to the programmer in ridiculous ways |
2022-06-22 14:00:14 +0200 | <Maxdamantus> | hpc: JavaScript's API is simple and accurate as long as you only deal with Windows filename APIs. |
2022-06-22 14:00:24 +0200 | <hpc> | on windows it just works, because somewhere there was magic to convert from utf-16 |
2022-06-22 14:00:35 +0200 | <hpc> | on linux you have to hope and pray, because filenames are just bytes |
2022-06-22 14:01:02 +0200 | <maerwald> | hpc: well, on windows, you also may have invalid UTF-16 |
2022-06-22 14:01:10 +0200 | <maerwald> | the encoding is in fact UCS-2 |
2022-06-22 14:01:19 +0200 | <hpc> | ugh, right |
2022-06-22 14:01:19 +0200 | <maerwald> | so you can have invalid surrogate pairs |
2022-06-22 14:02:00 +0200 | <Maxdamantus> | The fact that JS and windows are based around 16-bit strings is just a historical oddity. Going forward, the preference should be 8-bit strings, and we can use WTF-8 for backwards compatibility. Don't need to complicate the APIs for backwards-compatibility. |
2022-06-22 14:02:11 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) |
2022-06-22 14:02:38 +0200 | <merijn> | maerwald: Pretty sure it's proper utf-16 now? |
2022-06-22 14:02:49 +0200 | <merijn> | not 100% sure though |
2022-06-22 14:02:53 +0200 | <Maxdamantus> | merijn: pretty sure it's not. |
2022-06-22 14:02:57 +0200 | <maerwald> | merijn: no, you can easily create filepaths via the system API that are not UTF-16 |
2022-06-22 14:03:11 +0200 | <Maxdamantus> | unless you're talking about Windows 11. Haven't tested that personally. |
2022-06-22 14:03:37 +0200 | <Maxdamantus> | (I've certainly experimented with this stuff on Windows 10) |
2022-06-22 14:04:16 +0200 | <hpc> | wtf-8 doesn't solve the fact that linux filenames are bytes either |
2022-06-22 14:04:29 +0200 | <maerwald> | hpc: but it never fails, right? |
2022-06-22 14:04:54 +0200 | <maerwald> | haven't tested how it behaves in detail |
2022-06-22 14:05:10 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 14:05:11 +0200 | <maerwald> | current conversion in base can fail for encodings that are not supersets of ASCII |
2022-06-22 14:05:37 +0200 | <Maxdamantus> | hpc: WTF-8 is just there to solve the Windows problem. There is no problem with translating paths to byte strings on Linux. |
2022-06-22 14:05:37 +0200 | <maerwald> | such as some korean encodings afair |
2022-06-22 14:05:57 +0200 | <maerwald> | Maxdamantus: there is, because roundtripping isn't always defined |
2022-06-22 14:06:25 +0200 | <maerwald> | see https://hackage.haskell.org/package/base-4.16.1.0/docs/GHC-IO-Encoding.html#v:mkTextEncoding |
2022-06-22 14:07:05 +0200 | <maerwald> | the other issue is that most APIs assume that the filepaths you're consuming correspond to the current locale... which is... uhm, dumb |
2022-06-22 14:07:32 +0200 | <Maxdamantus> | maerwald: by "path" I mean as used by the OS, not as used by current Haskell. |
2022-06-22 14:07:57 +0200 | <maerwald> | Maxdamantus: I don't understand that statement then |
2022-06-22 14:08:00 +0200 | <Maxdamantus> | Haskell's APIs inaccurately represent paths as `String`. |
2022-06-22 14:09:05 +0200 | <maerwald> | system APIs don't interpret encoding... things like path separators '/' are defined accurately (byte in the ASCII set) and can be scanned for regardless of the actual filename encoding |
2022-06-22 14:09:19 +0200 | <Maxdamantus> | maerwald: paths in Linux are already just sequences of bytes, so if a programming language defined a string as a sequence of bytes, there's a no-op mapping between paths and strings. |
2022-06-22 14:10:01 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 248 seconds) |
2022-06-22 14:10:13 +0200 | <maerwald> | Maxdamantus: yes |
2022-06-22 14:10:19 +0200 | <maerwald> | that's what the new API does |
2022-06-22 14:10:42 +0200 | <maerwald> | there's literally no encoding/decoding |
2022-06-22 14:10:49 +0200 | <maerwald> | unless you want to get a Haskell String |
2022-06-22 14:12:39 +0200 | <Maxdamantus> | maerwald: but it's complicated because it introduces a new string type, which only exists because of a Windows API that might be being obsoleted. |
2022-06-22 14:12:53 +0200 | <maerwald> | Maxdamantus: no, it uses an existing string type |
2022-06-22 14:13:01 +0200 | <maerwald> | ShortByteString |
2022-06-22 14:13:02 +0200 | <Maxdamantus> | Which one? |
2022-06-22 14:13:09 +0200 | <Maxdamantus> | Hm. |
2022-06-22 14:14:02 +0200 | <Maxdamantus> | So do you get different `ShortString` values depending on whether the path was read on Windows/Linux? |
2022-06-22 14:14:25 +0200 | <maerwald> | yes, on windows you will have UCS-2LE bytestrings that contain \NUL bytes |
2022-06-22 14:14:27 +0200 | <Maxdamantus> | eg, for a filename that looks like "àéíóú"? |
2022-06-22 14:15:20 +0200 | <Maxdamantus> | and what, the `Show` instance converts differently depending on the OS? |
2022-06-22 14:15:41 +0200 | <maerwald> | Maxdamantus: yes... there's some tradeoff for the Show instance, because we have to convert to String |
2022-06-22 14:16:02 +0200 | <maerwald> | you can't define a total function that doesn't lose information and converts to String |
2022-06-22 14:16:18 +0200 | dsrt^ | (~dsrt@50.237.44.186) (Ping timeout: 264 seconds) |
2022-06-22 14:17:36 +0200 | <Maxdamantus> | Sure, so there should be a de facto string type capable of handling practically all Unicode strings (except UTF-32 ones) |
2022-06-22 14:18:17 +0200 | <Maxdamantus> | that type would be equivalent to `ByteString`, where WTF-8 is used for possibly ill-formed UTF-16. |
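To make the WTF-8 idea concrete, here is a minimal sketch of the generalized UTF-8 byte layout that WTF-8 builds on: a lone surrogate is encoded with the ordinary three-byte pattern instead of being rejected. The function name `encodeGeneralized` is hypothetical, the input is assumed to be a code point in the range 0 to 0x10FFFF, and the sketch does not do the step WTF-8 additionally requires of combining well-formed surrogate pairs into one supplementary code point first.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Encode one code point (assumed to be in 0..0x10FFFF) using the UTF-8 bit
-- layout, without rejecting surrogates (U+D800..U+DFFF).
encodeGeneralized :: Int -> [Word8]
encodeGeneralized cp
  | cp < 0x80 = [fromIntegral cp]
  | cp < 0x800 = [lead 0xC0 6, cont 0]
  | cp < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    -- Leading byte: length prefix combined with the top bits of the code point.
    lead :: Int -> Int -> Word8
    lead prefix n = fromIntegral (prefix .|. (cp `shiftR` n))
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont :: Int -> Word8
    cont n = fromIntegral (0x80 .|. ((cp `shiftR` n) .&. 0x3F))

main :: IO ()
main = do
  print (encodeGeneralized 0xE9)   -- U+00E9, an ordinary scalar value
  print (encodeGeneralized 0xD800) -- a lone surrogate, as in ill-formed UTF-16
```

Strict UTF-8 encoders refuse the second input; accepting it is what lets ill-formed 16-bit filenames round-trip through an 8-bit string.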
2022-06-22 14:19:00 +0200 | <maerwald> | the cool thing with this approach compared to WTF-8 is that you could easily use something like this https://hackage.haskell.org/package/charsetdetect-ae-1.1.0.4/docs/Codec-Text-Detect.html on the raw bytes |
2022-06-22 14:19:05 +0200 | <maerwald> | because we're not changing anything |
2022-06-22 14:19:44 +0200 | <maerwald> | Maxdamantus: I'm open to suggestions on how to handle the Show instances |
2022-06-22 14:20:21 +0200 | <Maxdamantus> | I'm not sure you can use that on a `ShortString` |
2022-06-22 14:20:39 +0200 | <maerwald> | https://hackage.haskell.org/package/filepath-2.0.0.3/candidate/docs/src/System.OsString.Internal.T… |
2022-06-22 14:20:42 +0200 | <Maxdamantus> | Unless it has surrogate code units in it, it could always be UTF-16. |
2022-06-22 14:21:41 +0200 | <Maxdamantus> | ie, if you read a filename into a `ShortString`, how does a detector know it's not UTF-16? |
2022-06-22 14:22:11 +0200 | <maerwald> | one way is to just convert Word8 to Char, but then you get garbled crap for most things... and the Show instance isn't really for serialization |
2022-06-22 14:22:38 +0200 | <maerwald> | Maxdamantus: see here https://hackage.haskell.org/package/filepath-2.0.0.3/candidate/docs/System-AbstractFilePath.html#g:3 |
2022-06-22 14:22:44 +0200 | <maerwald> | there are 3 functions for conversion |
2022-06-22 14:23:14 +0200 | <maerwald> | one that assumes Utf-8/UTF-16, one that allows to specify the encoding and one that looks up the filesystem encoding... all of them can fail |
2022-06-22 14:23:28 +0200 | waleee | (~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340) |
2022-06-22 14:24:34 +0200 | jao | (~jao@cpc103048-sgyl39-2-0-cust502.18-2.cable.virginm.net) |
2022-06-22 14:29:30 +0200 | <Maxdamantus> | Hm, so it's dependent on the OS. |
2022-06-22 14:29:48 +0200 | <Maxdamantus> | What happens when Windows starts offering bytestring-based filenames? |
2022-06-22 14:29:50 +0200 | <maerwald> | Maxdamantus: well, you could specify WTF-8 for both platforms |
2022-06-22 14:30:14 +0200 | <maerwald> | toAbstractFilePath wtf8 wtf8 fp |
2022-06-22 14:30:27 +0200 | <maerwald> | *toAbstractFilePathEnc |
2022-06-22 14:30:54 +0200 | <maerwald> | Maxdamantus: what do you mean? |
2022-06-22 14:31:20 +0200 | <maerwald> | filepaths on windows are already 'wchar_t*' |
2022-06-22 14:31:30 +0200 | alp_ | (~alp@user/alp) |
2022-06-22 14:31:42 +0200 | <Maxdamantus> | maerwald: if Windows in the future deprecates use of its 16-bit APIs and offers 8-bit APIs instead, where old filenames are transparently converted to WTF-8. |
2022-06-22 14:31:52 +0200 | <maerwald> | Maxdamantus: it will not deprecate that ever |
2022-06-22 14:32:01 +0200 | <maerwald> | windows cares about backwards compat |
2022-06-22 14:32:08 +0200 | <Maxdamantus> | WTF-8 is backwards-compatible. |
2022-06-22 14:32:16 +0200 | <maerwald> | I'm talking about windows |
2022-06-22 14:32:20 +0200 | <Maxdamantus> | So am I. |
2022-06-22 14:32:22 +0200 | <maerwald> | WTF-8 is a rust specific thing |
2022-06-22 14:32:27 +0200 | <maerwald> | has nothing to do with windows |
2022-06-22 14:32:41 +0200 | <Maxdamantus> | Windows could adopt it as part of a migration strategy to 8-bit filenames. |
2022-06-22 14:32:45 +0200 | <maerwald> | windows will continue to provide wide character versions of their system API |
2022-06-22 14:32:59 +0200 | <maerwald> | see https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew |
2022-06-22 14:33:10 +0200 | <maerwald> | CreateFileW stands for *wide character* |
2022-06-22 14:33:15 +0200 | <maerwald> | it will not change its semantics |
2022-06-22 14:33:57 +0200 | alp__ | (~alp@user/alp) (Read error: Connection reset by peer) |
2022-06-22 14:34:57 +0200 | <Maxdamantus> | Their current filesystems specifically support 16-bit strings, but it seems plausible that they might move away from that and just use 8-bit strings (cf. macOS). The new APIs could support old NTFS/FAT32 filenames still by transparently converting to WTF-8 (or something equivalent, but at this point there's no point in reinventing WTF-8). |
2022-06-22 14:35:08 +0200 | <maerwald> | no, it doesn't seem plausible |
2022-06-22 14:35:16 +0200 | [itchyjunk] | (~itchyjunk@user/itchyjunk/x-7353470) |
2022-06-22 14:35:23 +0200 | <maerwald> | windows doesn't randomly break API |
2022-06-22 14:35:36 +0200 | <maerwald> | that's why they still haven't migrated to UTF-16, but still support UCS-2 |
2022-06-22 14:35:44 +0200 | <maerwald> | after decades |
2022-06-22 14:35:49 +0200 | <Maxdamantus> | What API is being broken? |
2022-06-22 14:38:31 +0200 | <maerwald> | I don't understand what you're suggesting then. Wide character API works for all existing versions of windows. There's already an ANSI API that allows to configure stuff for UTF-8: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea |
2022-06-22 14:38:43 +0200 | <maerwald> | https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page |
2022-06-22 14:38:52 +0200 | <Maxdamantus> | aiui Windows has been gradually migrating things from UCS-2 to either UTF-8 or bytes, though I'm not sure about the details. |
2022-06-22 14:38:52 +0200 | <maerwald> | but that isn't supported across all windows versions |
2022-06-22 14:38:58 +0200 | <maerwald> | so that is not a good default |
2022-06-22 14:39:07 +0200 | dsrt^ | (~dsrt@50.237.44.186) |
2022-06-22 14:39:14 +0200 | fweht | (uid404746@id-404746.lymington.irccloud.com) |
2022-06-22 14:39:45 +0200 | <maerwald> | And all that doesn't matter to us. What matters is what the Win32 bindings use, and they use the wide character API: https://hackage.haskell.org/package/Win32 |
2022-06-22 14:42:12 +0200 | <Maxdamantus> | Right, but what happens if Windows starts supporting bytes-based filenames? Haskell should be able to switch over to the new API in order to handle them, but it's going to be awkward to do that if doing so means changing all of the `Show` behaviour etc for Windows users. |
2022-06-22 14:42:53 +0200 | <Maxdamantus> | eg, Windows has been adding integration for WSL. I think they're intending on running Android apps etc. |
2022-06-22 14:43:37 +0200 | <maerwald> | Maxdamantus: I don't think Win32 package will migrate to anything else. It will stick to wide character API. |
2022-06-22 14:43:58 +0200 | <Maxdamantus> | theoretically they could decide at some point to offer 8-bit filename APIs which are able to handle Linux filesystems without information loss, and they should also be fully capable of handling existing NTFS filesystems without information loss due to conversion to/from WTF-8. |
2022-06-22 14:44:13 +0200 | <Maxdamantus> | maerwald: I'm not saying they're going to remove the APIs. |
2022-06-22 14:44:51 +0200 | <Maxdamantus> | maerwald: just offer better ones that are usable in all the same cases as the current 16-bit ones, but also handle filenames from 8-bit systems, like WSL or network shares. |
2022-06-22 14:45:10 +0200 | <maerwald> | Maxdamantus: you're going to break Haskell for old windows versions then |
2022-06-22 14:46:23 +0200 | <Maxdamantus> | maerwald: you mean because Haskell has to pick to use either the new API (only supports Windows 13+) or the old API (supports all versions of Windows)? |
2022-06-22 14:46:32 +0200 | <Maxdamantus> | why can't it support both? |
2022-06-22 14:47:31 +0200 | <maerwald> | I don't understand what problem you're trying to solve. Of course it can provide bindings for both variants, but on some windows systems the UTF-8 one will *fail*. |
2022-06-22 14:48:08 +0200 | <Maxdamantus> | It will only fail when creating filenames that are unsupported on a filesystem, but that's already a possibility on Windows. |
2022-06-22 14:48:13 +0200 | <Maxdamantus> | eg, can't create a file called "con" |
2022-06-22 14:48:33 +0200 | <geekosaur> | uh |
2022-06-22 14:48:37 +0200 | <Maxdamantus> | or a file with some special characters in it, can't think of what they are off the top of my head. |
2022-06-22 14:48:56 +0200 | <maerwald> | "As of Windows Version 1903 (May 2019 Update), you can use the ActiveCodePage property in the appxmanifest for packaged apps, or the fusion manifest for unpackaged apps, to force a process to use UTF-8 as the process code page." |
2022-06-22 14:49:09 +0200 | <geekosaur> | so, wide vs. narrow characters are just a bit more intrusive than that |
2022-06-22 14:49:24 +0200 | <Maxdamantus> | anyway, need to go to bed. |
2022-06-22 14:49:26 +0200 | <Maxdamantus> | Thu Jun 23 12:49:26 AM NZST 2022 |
2022-06-22 14:50:18 +0200 | <maerwald> | 1. it requires configuration, 2. it doesn't work on all windows versions, 3. it complicates filepath handling |
2022-06-22 14:50:20 +0200 | <maerwald> | what's the gain |
2022-06-22 14:59:55 +0200 | dsrt^ | (~dsrt@50.237.44.186) (Ping timeout: 256 seconds) |
2022-06-22 15:00:30 +0200 | gurkenglas | (~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de) (Ping timeout: 276 seconds) |
2022-06-22 15:01:48 +0200 | ridcully | (~ridcully@pd951ff85.dip0.t-ipconnect.de) (Ping timeout: 276 seconds) |
2022-06-22 15:03:38 +0200 | juri__ | (~juri@79.140.115.124) (Ping timeout: 240 seconds) |
2022-06-22 15:03:42 +0200 | renzhi | (~xp@2607:fa49:6500:b100::f64a) (Ping timeout: 264 seconds) |
2022-06-22 15:04:08 +0200 | mrd | (~mrd@user/mrd) |
2022-06-22 15:04:10 +0200 | dschrempf | (~dominik@070-207.dynamic.dsl.fonira.net) (Quit: WeeChat 3.5) |
2022-06-22 15:09:12 +0200 | ChaiTRex | (~ChaiTRex@user/chaitrex) (Remote host closed the connection) |
2022-06-22 15:10:54 +0200 | kuribas | (~user@ip-188-118-57-242.reverse.destiny.be) (Ping timeout: 264 seconds) |
2022-06-22 15:12:16 +0200 | pleo | (~pleo@user/pleo) |
2022-06-22 15:14:50 +0200 | odnes | (~odnes@5-203-220-108.pat.nym.cosmote.net) (Remote host closed the connection) |
2022-06-22 15:15:09 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) (Ping timeout: 268 seconds) |
2022-06-22 15:15:12 +0200 | odnes | (~odnes@5-203-220-108.pat.nym.cosmote.net) |
2022-06-22 15:15:16 +0200 | ChaiTRex | (~ChaiTRex@user/chaitrex) |
2022-06-22 15:17:30 +0200 | chexum | (~quassel@gateway/tor-sasl/chexum) |
2022-06-22 15:18:26 +0200 | zebrag | (~chris@user/zebrag) |
2022-06-22 15:19:06 +0200 | shriekingnoise | (~shrieking@201.212.175.181) |
2022-06-22 15:26:34 +0200 | Unicorn_Princess | (~Unicorn_P@93-103-228-248.dynamic.t-2.net) |
2022-06-22 15:28:05 +0200 | toluene | (~toluene@user/toulene) |
2022-06-22 15:28:12 +0200 | Infinite | (~Infinite@49.39.125.113) |
2022-06-22 15:29:48 +0200 | dsrt^ | (~dsrt@50.237.44.186) |
2022-06-22 15:30:33 +0200 | juri_ | (~juri@79.140.115.124) |
2022-06-22 15:34:21 +0200 | crazazy | (~user@130.89.171.62) |
2022-06-22 15:36:06 +0200 | vysn | (~vysn@user/vysn) (Ping timeout: 264 seconds) |
2022-06-22 15:37:29 +0200 | cfricke | (~cfricke@user/cfricke) (Ping timeout: 248 seconds) |
2022-06-22 15:37:29 +0200 | k` | (~user@152.1.137.158) |
2022-06-22 15:39:23 +0200 | <k`> | How do I depend on a github repo in my cabal file? Currently I've written a 'source-repository' stanza for it, but cabal fails with 'unknown package'. |
2022-06-22 15:39:35 +0200 | <merijn> | k`: You don't |
2022-06-22 15:39:47 +0200 | <merijn> | k`: You probably want a cabal.project file |
2022-06-22 15:40:16 +0200 | <k`> | merijn: Thanks, I'll open up the docs on that. |
2022-06-22 15:41:15 +0200 | <k`> | Do I use one cabal.project file for the entire project, or do I put one in each package subdir? |
2022-06-22 15:42:25 +0200 | <sm> | one for the project |
2022-06-22 15:42:52 +0200 | dsrt^ | (~dsrt@50.237.44.186) (Remote host closed the connection) |
2022-06-22 15:42:59 +0200 | <k`> | sm: Thanks. That's what the name seems to imply but I didn't want to make any foolish assumptions. |
2022-06-22 15:43:04 +0200 | <merijn> | k`: So, the distinction is: a .cabal is a standalone description of a specific package (dependencies, flags, etc.) "cabal.project" is for defining the context in which a project (of one or more packages) is being used/built and allows you to override things (like saying to use a local directory or git repo to develop against unreleased code) |
2022-06-22 15:43:50 +0200 | eod|fserucas_ | (~eod|fseru@193.65.114.89.rev.vodafone.pt) (Quit: Leaving) |
2022-06-22 15:44:06 +0200 | <k`> | So just out of curiosity, how would the individual packages be built when they don't have access to the overall project description? |
2022-06-22 15:45:04 +0200 | <sm> | there doesn't seem to be an introduction to cabal.project in the user guide |
2022-06-22 15:45:25 +0200 | <merijn> | k`: The idea is that individual packages (when you release them) only depend on other released packages/versions, not git repos |
2022-06-22 15:45:31 +0200 | <merijn> | sm: There was a WIP to write one |
2022-06-22 15:46:07 +0200 | <jackdk> | the reference is at least thorough, but I'm not aware of any good intros: https://cabal.readthedocs.io/en/3.6/cabal-project.html |
2022-06-22 15:46:13 +0200 | <sclv> | https://cabal.readthedocs.io/en/3.6/cabal-project.html and https://cabal.readthedocs.io/en/3.6/nix-local-build.html#developing-multiple-packages |
2022-06-22 15:46:24 +0200 | <sclv> | the latter of the two i posted is sort of an intro |
2022-06-22 15:46:46 +0200 | <k`> | So, say I have packages 'foo-class', 'foo-pattern', and 'foo-type', with 'foo-pattern' and 'foo-type' both depending on 'foo-class'. Where do I give the repo for 'foo-class' in 'foo-type' so that 'foo-type' can be built independently? |
2022-06-22 15:47:10 +0200 | <sclv> | the packages all are like normal packages. the project file ties them all together |
2022-06-22 15:47:20 +0200 | <merijn> | k`: The idea would be that, eventually foo-class gets released on hackage |
2022-06-22 15:47:38 +0200 | <merijn> | k`: Basically "where to find a package" is NOT something .cabal files are concerned with |
2022-06-22 15:47:45 +0200 | <merijn> | They merely state "what package" |
2022-06-22 15:47:49 +0200 | <merijn> | (and version) |
2022-06-22 15:48:16 +0200 | <merijn> | k`: The implicit context is that "where" is "the package repository (aka Hackage instance) that you happen to point cabal-install at" |
2022-06-22 15:48:26 +0200 | <k`> | merijn: Fair enough. I'm just always hesitant to release anything to Hackage because my code quality is shit. |
2022-06-22 15:48:47 +0200 | <k`> | But I still want to make things properly modular. |
2022-06-22 15:48:53 +0200 | <sclv> | a hackage release is a package tarball. those don't include cabal.project files |
2022-06-22 15:48:56 +0200 | <merijn> | k`: You can run your own hackage and cabal-install can be pointed at a different (and even multiple!) hackages :) |
2022-06-22 15:49:07 +0200 | <sclv> | cabal.project files are _only_ for use in developing a collection of packages from a repo |
2022-06-22 15:49:15 +0200 | <merijn> | k`: So it's perfectly possible to have a personal/company/whatever Hackage repo |
2022-06-22 15:49:28 +0200 | <merijn> | sclv: Not *only* for that |
2022-06-22 15:49:29 +0200 | <sclv> | once you upload to hackage, you should ensure all the deps are already on hackage |
2022-06-22 15:50:13 +0200 | coot | (~coot@213.134.190.95) |
2022-06-22 15:50:22 +0200 | motherfsck | (~motherfsc@user/motherfsck) (Quit: quit) |
2022-06-22 15:51:05 +0200 | <sm> | ah there it is, https://cabal.readthedocs.io/en/3.6/nix-local-build.html#developing-multiple-packages . The cursed "nix-style" jargon strikes again |
2022-06-22 15:51:50 +0200 | cfricke | (~cfricke@user/cfricke) |
2022-06-22 15:52:07 +0200 | motherfsck | (~motherfsc@user/motherfsck) |
2022-06-22 15:52:10 +0200 | <k`> | Anyone know what happens when I list a source-repository-package that points to a subdir of a project that uses cabal.project to build? Just fails to build? |
2022-06-22 15:52:42 +0200 | <sm> | (I did search the site for "cabal.project", must have missed the Quickstart) |
2022-06-22 15:52:42 +0200 | <merijn> | k`: What do you mean? |
2022-06-22 15:53:24 +0200 | <merijn> | k`: If you have a cabal.project in a directory for project X, then X and all its dependencies should be findable via that 1 cabal.project |
2022-06-22 15:53:57 +0200 | <merijn> | k`: if you meant "X depends on Y and Y uses cabal.project to find Z", then cabal.project for X needs to include repo pointers for both Y and Z |
2022-06-22 15:54:35 +0200 | <k`> | merijn: Oh, that is good to know. |
2022-06-22 15:54:43 +0200 | <k`> | Would not have expected that. |
2022-06-22 15:56:04 +0200 | <k`> | So if I'm trying to modularize with multiple packages I should throw them all in one huge package repo so they can all find their dependencies, and I don't need to update the cabal.project of Z when a new transitive dependency is added. |
2022-06-22 15:56:35 +0200 | <merijn> | k`: If they're interdependent then I would say yes |
2022-06-22 15:56:48 +0200 | <sclv> | that's a common pattern, yes |
2022-06-22 15:56:51 +0200 | <merijn> | k`: See for example: https://github.com/merijn/broadcast-chan |
2022-06-22 15:57:38 +0200 | <merijn> | k`: Although once you get past, say, 10 packages in the same repo I would start questioning what I'm doing if I were you ;) |
2022-06-22 15:57:42 +0200 | gmg | (~user@user/gehmehgeh) (Ping timeout: 268 seconds) |
2022-06-22 15:58:56 +0200 | <k`> | I see that there you give the direct paths to the subdirectories. Is it standard to do that rather than round tripping through github? |
2022-06-22 15:59:33 +0200 | <merijn> | k`: Yes, pointing at github will use whatever is currently on github *NOT* what you have in your local clone |
2022-06-22 15:59:34 +0200 | vglfr | (~vglfr@coupling.penchant.volia.net) (Read error: Connection reset by peer) |
2022-06-22 15:59:41 +0200 | gmg | (~user@user/gehmehgeh) |
2022-06-22 15:59:45 +0200 | vglfr | (~vglfr@coupling.penchant.volia.net) |
2022-06-22 15:59:50 +0200 | <merijn> | k`: Whereas the subdirectories tell it to use whatever is in the local subdirectories *right now* |
2022-06-22 15:59:56 +0200 | <merijn> | Which is probably what you want |
2022-06-22 16:00:35 +0200 | <merijn> | (because if you're locally changing foo-class, you probably want local versions of foo-instance to pick that up :p) |
2022-06-22 16:01:40 +0200 | Guest59 | (~Guest59@148.253.134.213) |
2022-06-22 16:02:10 +0200 | Guest59 | (~Guest59@148.253.134.213) (Client Quit) |
2022-06-22 16:02:20 +0200 | Infinite | (~Infinite@49.39.125.113) (Quit: Client closed) |
2022-06-22 16:02:41 +0200 | Infinite | (~Infinite@49.39.125.113) |
2022-06-22 16:02:55 +0200 | mecharyuujin | (~mecharyuu@2409:4050:ece:7592:439d:86ae:5a53:fec7) |
2022-06-22 16:03:00 +0200 | pleo | (~pleo@user/pleo) (Quit: quit) |
2022-06-22 16:04:44 +0200 | <mecharyuujin> | Heya, beginner here, am learning Haskell using the Learn You a Haskell tutorial |
2022-06-22 16:05:04 +0200 | <mecharyuujin> | head'' :: [a] -> a |
2022-06-22 16:05:11 +0200 | <mecharyuujin> | head'' = foldr1 (\x _ -> x) |
2022-06-22 16:05:26 +0200 | <mecharyuujin> | why does this version of head work on infinite lists? |
2022-06-22 16:06:57 +0200 | <mecharyuujin> | I thought foldr1 would need the last element of the list as the starting value, and even though it is useless in this case, how would GHC know that it's useless here? Is GHC able to figure it out? |
2022-06-22 16:09:08 +0200 | <k`> | merijn, sclv, sm, thank you so much. I think I know what to do now. |
2022-06-22 16:09:35 +0200 | <merijn> | mecharyuujin: Consider this: Can you rewrite the application of head'' by replacing "foldr1" with the definition of foldr1? |
2022-06-22 16:09:47 +0200 | <merijn> | i.e. take foldr1, turn it into a lambda, insert in the code for head'' |
2022-06-22 16:09:57 +0200 | albet70 | (~xxx@2400:8902::f03c:92ff:fe60:98d8) (Remote host closed the connection) |
2022-06-22 16:10:10 +0200 | <k`> | Maybe if I ever get better at programming some of this will make it to Hackage. But considering how little I've improved in the last 14 years of using Haskell, it seems unlikely :-) |
2022-06-22 16:12:06 +0200 | Vajb | (~Vajb@2001:999:40:4c50:1b24:879c:6df3:1d06) (Read error: Connection reset by peer) |
2022-06-22 16:12:12 +0200 | <sm> | it'll start to flow one of these days! |
2022-06-22 16:12:51 +0200 | Vajb | (~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) |
2022-06-22 16:15:01 +0200 | <carbolymer> | do you know if I can somehow put seed into hedgehog to not have random output from generators? |
2022-06-22 16:15:14 +0200 | <carbolymer> | or into tasty |
2022-06-22 16:15:53 +0200 | pleo | (~pleo@user/pleo) |
2022-06-22 16:16:04 +0200 | albet70 | (~xxx@2400:8902::f03c:92ff:fe60:98d8) |
2022-06-22 16:17:41 +0200 | <mecharyuujin> | merijn, I am not sure how I would turn foldr1 into a lambda. If I had to implement it, I would probably do it like |
2022-06-22 16:17:45 +0200 | jakalx | (~jakalx@base.jakalx.net) (Error from remote client) |
2022-06-22 16:17:47 +0200 | <mecharyuujin> | foldr1 f [x] = x |
2022-06-22 16:17:55 +0200 | <mecharyuujin> | foldr1 f (x:xs) = f x (foldr1 f xs) |
2022-06-22 16:20:05 +0200 | jakalx | (~jakalx@base.jakalx.net) |
2022-06-22 16:21:50 +0200 | <merijn> | ok, so let's fill in the lambda from head'' in that code |
2022-06-22 16:22:07 +0200 | <merijn> | Clearly in the first case it will return the first item, yeah? |
2022-06-22 16:22:17 +0200 | <merijn> | So, let's look at the 2nd case |
2022-06-22 16:22:28 +0200 | <mecharyuujin> | yeah |
2022-06-22 16:22:30 +0200 | <merijn> | foldr1 f (x:xs) = f x (foldr1 f xs) |
2022-06-22 16:23:10 +0200 | <merijn> | Let's rename in head'' to get: head'' = foldr1 (\y _ -> y) |
2022-06-22 16:23:15 +0200 | <merijn> | (remove some name confusion) |
2022-06-22 16:24:06 +0200 | <merijn> | Actually, let's eta expand too: head'' (x:xs) = foldr1 (\y _ -> y) (x:xs) |
2022-06-22 16:24:46 +0200 | <merijn> | Expand foldr1 using its definition and we get: (\y _ -> y) x (foldr1 (\y _ -> y) xs) |
2022-06-22 16:25:03 +0200 | <mecharyuujin> | Ah, I see |
2022-06-22 16:25:05 +0200 | <merijn> | Which will obviously return 'x' (so the head of the list) |
2022-06-22 16:25:06 +0200 | <mecharyuujin> | this is simply x |
2022-06-22 16:25:15 +0200 | <mecharyuujin> | Thanks a ton, merijn! |
2022-06-22 16:25:29 +0200 | <merijn> | And laziness means we only evaluate the 2nd argument (the recursive foldr1 call) when needed (i.e. never) |
2022-06-22 16:25:38 +0200 | <mecharyuujin> | yeah |
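The reduction walked through above can be checked directly. This is a minimal sketch, assuming a hand-written `myFoldr1` (a hypothetical name) that mirrors the two-clause definition from the chat rather than the actual implementation in base:

```haskell
-- Hand-rolled foldr1 following the definition discussed above
-- (a sketch, not base's actual implementation).
myFoldr1 :: (a -> a -> a) -> [a] -> a
myFoldr1 _ [x]    = x
myFoldr1 f (x:xs) = f x (myFoldr1 f xs)
myFoldr1 _ []     = error "myFoldr1: empty list"

-- The lambda ignores its second argument, so the recursive call
-- stays an unevaluated thunk and is never forced.
head'' :: [a] -> a
head'' = myFoldr1 (\x _ -> x)

main :: IO ()
main = do
  print (head'' ([1 ..] :: [Integer]))  -- terminates on an infinite list
  print (head'' "abc")
```

Matching the `[x]` clause only forces the tail far enough to see that it is not empty, which is finite work even for an infinite list.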
2022-06-22 16:30:10 +0200 | eggplantade | (~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net) |
2022-06-22 16:32:44 +0200 | waleee | (~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340) (Ping timeout: 255 seconds) |
2022-06-22 16:33:10 +0200 | Infinite | (~Infinite@49.39.125.113) (Ping timeout: 252 seconds) |
2022-06-22 16:33:13 +0200 | fnurglewitz | (uid263868@id-263868.lymington.irccloud.com) (Quit: Connection closed for inactivity) |
2022-06-22 16:33:42 +0200 | HotblackDesiato | (~HotblackD@gateway/tor-sasl/hotblackdesiato) (Remote host closed the connection) |
2022-06-22 16:33:58 +0200 | HotblackDesiato | (~HotblackD@gateway/tor-sasl/hotblackdesiato) |
2022-06-22 16:34:41 +0200 | eggplantade | (~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net) (Ping timeout: 268 seconds) |
2022-06-22 16:35:43 +0200 | Sgeo | (~Sgeo@user/sgeo) |
2022-06-22 16:35:45 +0200 | <mecharyuujin> | How is foldl/foldl1 implemented? Using (init xs) and (last xs) doesn't seem particularly efficient... |
2022-06-22 16:39:00 +0200 | ccntrq1 | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 16:39:03 +0200 | <k`> | mecharyuujin: foldl folds from right to left, starting with the accumulator value. |
2022-06-22 16:39:30 +0200 | <k`> | Think of it as a loop onto an accumulator rather than a fold like foldr. |
2022-06-22 16:39:42 +0200 | <k`> | Sorry, left to right. |
2022-06-22 16:39:50 +0200 | <k`> | Just like foldr. |
2022-06-22 16:41:13 +0200 | Timely_Ratio9567 | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) |
2022-06-22 16:41:58 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 240 seconds) |
2022-06-22 16:42:01 +0200 | Timely_Ratio9567 | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Client Quit) |
2022-06-22 16:42:12 +0200 | Timely_Ratio9567 | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) |
2022-06-22 16:43:16 +0200 | Timely_Ratio9567 | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Client Quit) |
2022-06-22 16:43:23 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) |
2022-06-22 16:43:28 +0200 | Timely_Ratio9567 | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) |
2022-06-22 16:43:29 +0200 | Timely_Ratio9567 | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Remote host closed the connection) |
2022-06-22 16:43:32 +0200 | mecharyuujin | (~mecharyuu@2409:4050:ece:7592:439d:86ae:5a53:fec7) (Ping timeout: 255 seconds) |
2022-06-22 16:43:37 +0200 | ccntrq1 | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Ping timeout: 248 seconds) |
2022-06-22 16:45:18 +0200 | mecharyuujin | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) |
2022-06-22 16:45:50 +0200 | _73 | (~user@pool-108-49-252-36.bstnma.fios.verizon.net) (Remote host closed the connection) |
2022-06-22 16:48:31 +0200 | Timely_Ratio9567 | (~mecharyuu@2409:4050:2d4b:a853:8048:c716:f88e:d09f) |
2022-06-22 16:51:05 +0200 | mecharyuujin | (~mecharyuu@2405:204:302a:37df:1901:27c8:4070:e6e2) (Ping timeout: 248 seconds) |
2022-06-22 16:51:42 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 16:59:36 +0200 | Timely_Ratio9567 | (~mecharyuu@2409:4050:2d4b:a853:8048:c716:f88e:d09f) (Quit: Leaving) |
2022-06-22 17:03:50 +0200 | lortabac | (~lortabac@2a01:e0a:541:b8f0:2cd:7ecf:235f:1481) (Quit: WeeChat 2.8) |
2022-06-22 17:10:13 +0200 | <tomsmeding> | @src foldl |
2022-06-22 17:10:13 +0200 | <lambdabot> | foldl f z [] = z |
2022-06-22 17:10:13 +0200 | <lambdabot> | foldl f z (x:xs) = foldl f (f z x) xs |
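The `@src` output above shows that foldl never needs `init` or `last`: it walks the list front to back and threads an accumulator. A small sketch, with `myFoldl` and `myFoldl1` as hypothetical names chosen to avoid clashing with the Prelude; the `myFoldl1` clause is the textbook definition and is an assumption about, not a copy of, what base does:

```haskell
-- foldl as printed by lambdabot, renamed to avoid the Prelude clash.
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _ z []     = z
myFoldl f z (x:xs) = myFoldl f (f z x) xs

-- foldl1 simply seeds the accumulator with the first element.
myFoldl1 :: (a -> a -> a) -> [a] -> a
myFoldl1 f (x:xs) = myFoldl f x xs
myFoldl1 _ []     = error "myFoldl1: empty list"

main :: IO ()
main = do
  print (myFoldl (-) 10 [1, 2, 3 :: Int])    -- ((10 - 1) - 2) - 3
  print (myFoldl1 (-) [10, 1, 2, 3 :: Int])  -- same grouping, seed taken from the list
  print (scanl (-) 10 [1, 2, 3 :: Int])      -- every intermediate accumulator
```

Because the accumulator is only returned once the list runs out, this shape cannot stop early on an infinite list the way the foldr1-based head'' can.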
2022-06-22 17:13:44 +0200 | stackdroid18 | (14094@user/stackdroid) |
2022-06-22 17:24:59 +0200 | Unicorn_Princess | (~Unicorn_P@93-103-228-248.dynamic.t-2.net) (Remote host closed the connection) |
2022-06-22 17:28:05 +0200 | fweht | (uid404746@id-404746.lymington.irccloud.com) (Quit: Connection closed for inactivity) |
2022-06-22 17:29:31 +0200 | ccntrq | (~Thunderbi@exit-1.office.han.de.mhd.medondo.com) (Remote host closed the connection) |
2022-06-22 17:31:37 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection) |
2022-06-22 17:33:42 +0200 | chele | (~chele@user/chele) (Remote host closed the connection) |
2022-06-22 17:33:54 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 17:36:02 +0200 | mc47 | (~mc47@xmonad/TheMC47) (Remote host closed the connection) |
2022-06-22 17:37:32 +0200 | Vajb | (~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) (Read error: Connection reset by peer) |
2022-06-22 17:37:43 +0200 | Vajb | (~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) |
2022-06-22 17:38:10 +0200 | Vajb | (~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) (Read error: Connection reset by peer) |
2022-06-22 17:38:50 +0200 | Vajb | (~Vajb@hag-jnsbng11-58c3a8-176.dhcp.inet.fi) |
2022-06-22 17:44:19 +0200 | mbuf | (~Shakthi@122.164.15.152) (Quit: Leaving) |
2022-06-22 17:45:22 +0200 | _xor | (~xor@74.215.182.83) |
2022-06-22 17:45:41 +0200 | Surobaki | (~surobaki@137.44.222.80) (Read error: Connection reset by peer) |
2022-06-22 17:47:31 +0200 | haritz | (~hrtz@user/haritz) (Remote host closed the connection) |
2022-06-22 17:49:50 +0200 | ridcully | (~ridcully@pd951f3bf.dip0.t-ipconnect.de) |
2022-06-22 17:51:37 +0200 | MajorBiscuit | (~MajorBisc@wlan-145-94-167-213.wlan.tudelft.nl) (Ping timeout: 256 seconds) |
2022-06-22 17:55:12 +0200 | werneta | (~werneta@70-142-214-115.lightspeed.irvnca.sbcglobal.net) (Ping timeout: 260 seconds) |
2022-06-22 17:57:06 +0200 | merijn | (~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl) (Ping timeout: 264 seconds) |
2022-06-22 18:00:10 +0200 | shlevy[m] | (~shlevymat@2001:470:69fc:105::1:d3b1) (Quit: You have been kicked for being idle) |
2022-06-22 18:03:53 +0200 | <sm> | I give up. How do you get the current system locale ? |
2022-06-22 18:04:05 +0200 | <sm> | or time locale ? |
2022-06-22 18:05:25 +0200 | lagash | (lagash@lagash.shelltalk.net) |
2022-06-22 18:05:40 +0200 | <geekosaur> | afaik you have to use the old-locale package to get the time locale. not sure about system locale unless it's buried in GHC.IO somewhere |
2022-06-22 18:06:41 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 18:06:42 +0200 | cfricke | (~cfricke@user/cfricke) (Ping timeout: 264 seconds) |
2022-06-22 18:06:58 +0200 | <sm> | that provides https://hackage.haskell.org/package/time-1.13/docs/Data-Time-Format.html#v:defaultTimeLocale , "Locale representing American usage." I'm not sure what that means now |
2022-06-22 18:08:07 +0200 | wagle | (~wagle@quassel.wagle.io) (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.) |
2022-06-22 18:08:14 +0200 | <sm> | even though I'm using it plenty |
2022-06-22 18:08:23 +0200 | pavonia | (~user@user/siracusa) (Quit: Bye!) |
2022-06-22 18:08:37 +0200 | wagle | (~wagle@quassel.wagle.io) |
2022-06-22 18:09:03 +0200 | <sm> | alright, yes that's a constant. I want what's currently set eg with LC_TIME |
2022-06-22 18:09:15 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection) |
2022-06-22 18:09:17 +0200 | terrorjack | (~terrorjac@2a01:4f8:1c1e:509a::1) (Quit: The Lounge - https://thelounge.chat) |
2022-06-22 18:10:25 +0200 | odnes | (~odnes@5-203-220-108.pat.nym.cosmote.net) (Remote host closed the connection) |
2022-06-22 18:10:47 +0200 | odnes | (~odnes@5-203-220-108.pat.nym.cosmote.net) |
2022-06-22 18:11:30 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 264 seconds) |
2022-06-22 18:13:32 +0200 | <sm> | https://stackoverflow.com/questions/28077322/getting-the-date-format-for-the-current-locale recommends current-locale (from 2015). I had tried this, but its TimeLocale is incompatible.. so not very useful. Strange.. |
2022-06-22 18:14:26 +0200 | <tomsmeding> | sm: the C way would be to call setlocale(LC_ALL, NULL), I guess you could bind that manually |
2022-06-22 18:14:53 +0200 | <tomsmeding> | the ghc repo (hence base) doesn't contain any relevant calls to setlocale, and neither 'time' nor 'old-locale' have any hits when searching for setlocale in the git repo |
2022-06-22 18:15:14 +0200 | <tomsmeding> | side note, "The setlocale() function is used to set or query the program's current locale." illustrates the great naming of that function |
2022-06-22 18:15:19 +0200 | <sm> | does it mean that basically no haskell programs are aware of system time locale, eg for parsing/printing localised month names ? |
2022-06-22 18:15:48 +0200 | <geekosaur> | xmonad binds setlocale but only to force it to locale "C" |
2022-06-22 18:20:30 +0200 | pleo | (~pleo@user/pleo) (Ping timeout: 264 seconds) |
2022-06-22 18:21:12 +0200 | Andrew | GNU\Andrew |
2022-06-22 18:21:26 +0200 | <tomsmeding> | also found this interesting library: https://hackage.haskell.org/package/env-locale-1.0.0.1/docs/src/System-Locale-Current.html#current… |
2022-06-22 18:21:49 +0200 | <tomsmeding> | the funny thing being, that 'prepare_locale' binds to the function at the top here https://hackage.haskell.org/package/env-locale-1.0.0.1/src/cbits/glue.c |
2022-06-22 18:22:13 +0200 | <tomsmeding> | oh wait I misread the manpage, disregard |
2022-06-22 18:22:42 +0200 | <tomsmeding> | sm: did you check that library already, or is the TimeLocale of that thing incompatible too? |
2022-06-22 18:23:26 +0200 | merijn | (~merijn@c-001-001-018.client.esciencecenter.eduvpn.nl) |
2022-06-22 18:24:39 +0200 | <tomsmeding> | the returned knownTimeZones is bogus though |
2022-06-22 18:26:20 +0200 | <sm> | tomsmeding: no I hadn't seen that one |
2022-06-22 18:26:37 +0200 | HotblackDesiato | (~HotblackD@gateway/tor-sasl/hotblackdesiato) (Remote host closed the connection) |
2022-06-22 18:27:26 +0200 | <sm> | looking closer, current-locale's is I guess the TimeLocale defined by old-locale, but Data.Time.Format expects the one defined by time. So I guess current-locale needs an update to use that |
2022-06-22 18:27:33 +0200 | HotblackDesiato | (~HotblackD@gateway/tor-sasl/hotblackdesiato) |
2022-06-22 18:28:16 +0200 | <sm> | env-locale's looks like the right one |
2022-06-22 18:28:29 +0200 | <tomsmeding> | Found using hackage search for "locale" :) |
2022-06-22 18:30:05 +0200 | yauhsien_ | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 18:30:05 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Read error: Connection reset by peer) |
2022-06-22 18:30:45 +0200 | <sm> | thanks tomsmeding |
2022-06-22 18:34:52 +0200 | jespada | (~jespada@cpc121022-nmal24-2-0-cust171.19-2.cable.virginm.net) (Ping timeout: 260 seconds) |
2022-06-22 18:35:09 +0200 | dlbh^ | (~dlbh@50.237.44.186) |
2022-06-22 18:36:05 +0200 | Feuermagier_ | (~Feuermagi@138.199.36.237) (Quit: Leaving) |
2022-06-22 18:36:16 +0200 | Feuermagier | (~Feuermagi@user/feuermagier) |
2022-06-22 18:36:34 +0200 | sjanssen | (~sjanssenm@2001:470:69fc:105::1:61d8) |
2022-06-22 18:37:17 +0200 | <tomsmeding> | sm: I guess part of the problem is that e.g. I have my system set to en_US.UTF8 despite there being an ocean between us |
2022-06-22 18:37:21 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 18:37:50 +0200 | <tomsmeding> | Gimme the original text please, don't go translating my compiler errors |
2022-06-22 18:38:23 +0200 | <tomsmeding> | I've seen gcc errors getting translated on another person's machine and boy is that awkward |
2022-06-22 18:38:58 +0200 | cheater1__ | (~Username@user/cheater) |
2022-06-22 18:39:00 +0200 | <tomsmeding> | Even apart from the fact that some errors aren't translated, and the flags aren't, etc |
2022-06-22 18:39:05 +0200 | jespada | (~jespada@cpc121022-nmal24-2-0-cust171.19-2.cable.virginm.net) |
2022-06-22 18:39:06 +0200 | cheater | (~Username@user/cheater) (Ping timeout: 264 seconds) |
2022-06-22 18:39:11 +0200 | cheater1__ | cheater |
2022-06-22 18:39:33 +0200 | <sm> | my context: someone wants hledger to parse their CSV dates correctly with %b recognising "abr" as april |
2022-06-22 18:39:58 +0200 | <tomsmeding> | Okay that makes a lot of sense, but I wouldn't want that to be dependent on the system locale |
2022-06-22 18:40:26 +0200 | <sm> | using the system locale would be the best default, no ? |
2022-06-22 18:40:31 +0200 | <tomsmeding> | Then you get excel-like shenanigans where some systems want SUM(a, b) and others SUM(a; b), never mind SOM(a; b) in NL |
2022-06-22 18:40:58 +0200 | <tomsmeding> | I'd want en_US to be the default for consistency and reproducibility |
2022-06-22 18:41:11 +0200 | <tomsmeding> | But then as said I have my system locale set to that anyway :p |
2022-06-22 18:41:49 +0200 | <tomsmeding> | The thing being that if it recognises abr as April, then it doesn't recognise apr anymore (presumably) |
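The trade-off described here can be seen with the `time` package alone, since `parseTimeM` takes the `TimeLocale` as an explicit argument. A minimal sketch: `esLocale` and `parseDay` are hypothetical names, the Spanish month table is written by hand for illustration, and nothing below reads `LC_TIME` or any other system setting:

```haskell
import Data.Time.Calendar (Day)
import Data.Time.Format (TimeLocale (..), defaultTimeLocale, parseTimeM)

-- Hypothetical Spanish locale: only the month names are overridden,
-- everything else is inherited from defaultTimeLocale.
esLocale :: TimeLocale
esLocale = defaultTimeLocale
  { months =
      [ ("enero", "ene"), ("febrero", "feb"), ("marzo", "mar")
      , ("abril", "abr"), ("mayo", "may"), ("junio", "jun")
      , ("julio", "jul"), ("agosto", "ago"), ("septiembre", "sep")
      , ("octubre", "oct"), ("noviembre", "nov"), ("diciembre", "dic")
      ]
  }

-- Parse a "day abbreviated-month year" date under the given locale.
parseDay :: TimeLocale -> String -> Maybe Day
parseDay loc = parseTimeM True loc "%d %b %Y"

main :: IO ()
main = do
  print (parseDay esLocale "12 abr 2022")           -- month table contains "abr"
  print (parseDay defaultTimeLocale "12 abr 2022")  -- month table does not
  print (parseDay defaultTimeLocale "12 Apr 2022")
```

Keeping the locale an explicit value like this would let a program default to `defaultTimeLocale` and switch to another table only when the user asks for it.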
2022-06-22 18:41:57 +0200 | <sm> | well, I hear that. I was the same way about UTF8 (but lately I discovered I didn't enforce that from the start and people are reading with bizarro system encodings) |
2022-06-22 18:42:13 +0200 | <sm> | like "latin-1" |
2022-06-22 18:42:44 +0200 | <k`> | sm: I'm so sorry to hear that. |
2022-06-22 18:43:04 +0200 | <tomsmeding> | Is there a "system encoding", and is that set to latin-1 in those cases? |
2022-06-22 18:43:42 +0200 | <sm> | yes, I'm afraid there is and it is |
2022-06-22 18:44:08 +0200 | <tomsmeding> | ._. |
2022-06-22 18:44:09 +0200 | <sm> | tomsmeding: just to be clear, you'd favour sticking with en_US as default, but allowing user to override it at run time ? |
2022-06-22 18:44:37 +0200 | <tomsmeding> | Yes, and same for UTF8 actually - but apparently that ship has sailed. But this is just my opinion :) |
2022-06-22 18:44:46 +0200 | <sm> | and when I say people, I mean one guy. |
2022-06-22 18:45:01 +0200 | <tomsmeding> | :p |
2022-06-22 18:45:17 +0200 | <tomsmeding> | There are also people still running windows xp |
2022-06-22 18:45:28 +0200 | <sm> | revisiting the UTF8 thing is actually the current top priority hledger issue. But I'm taking a break as I got sick of it :) |
2022-06-22 18:46:46 +0200 | <tomsmeding> | Might even have an environment variable that instructs hledger to use a particular (or the system) locale, so that one doesn't have to set that each time, or to use a shell alias |
2022-06-22 18:46:51 +0200 | <tomsmeding> | But yes |
2022-06-22 18:47:18 +0200 | sm | guesses encoding and time locale should probably be handled the same way, whatever that is |
2022-06-22 18:47:19 +0200 | <k`> | Hecate: Thoughts on Haskell parsing locale and then letting you write `classe Traversable (Soit c) ou traverser f = soit (pur . Gauche) (fmap Droite . f)` ? |
2022-06-22 18:47:40 +0200 | <tomsmeding> | Understandable to get sick from locales and encodings, my burn with locales was when my (C++) code started failing to parse my save files when I added a user interface |
2022-06-22 18:48:06 +0200 | even4void | (even4void@came.here.for-some.fun) (Quit: fBNC - https://bnc4free.com) |
2022-06-22 18:48:06 +0200 | xacktm | (xacktm@user/xacktm) (Quit: fBNC - https://bnc4free.com) |
2022-06-22 18:48:38 +0200 | <tomsmeding> | Turned out that that system had an nl_NL locale set for numeric, and my file format used floats, and the gtk library calls setlocale(LC_ALL, "") -- previously I'd unknowingly been running in the default, namely C |
2022-06-22 18:49:15 +0200 | <sm> | lovely |
2022-06-22 18:49:25 +0200 | leeb | (~leeb@KD106155002239.au-net.ne.jp) (Ping timeout: 256 seconds) |
2022-06-22 18:49:37 +0200 | econo | (uid147250@user/econo) |
2022-06-22 18:49:37 +0200 | andreas303 | (andreas303@ip227.orange.bnc4free.com) (Quit: fBNC - https://bnc4free.com) |
2022-06-22 18:50:45 +0200 | <tomsmeding> | (we use , for decimals over here) |
2022-06-22 18:51:21 +0200 | <k`> | I have 'current format' set to English(Sweden). Wonder what that's subtly messing up. |
2022-06-22 18:51:45 +0200 | <yushyin> | tomsmeding: en_IE.UTF-8 is my preferred locale :) |
2022-06-22 18:57:30 +0200 | <tomsmeding> | yushyin: why specifically IE? (Are you in Ireland?) |
2022-06-22 18:58:05 +0200 | vysn | (~vysn@user/vysn) |
2022-06-22 18:58:28 +0200 | gurkenglas | (~gurkengla@dslb-002-207-014-022.002.207.pools.vodafone-ip.de) |
2022-06-22 19:00:43 +0200 | <tomsmeding> | I've heard that en_DK is ideal because they apparently use the yyyy-mm-dd date format — if I don't misremember |
2022-06-22 19:00:55 +0200 | jakalx | (~jakalx@base.jakalx.net) (Error from remote client) |
2022-06-22 19:00:59 +0200 | <k`> | Sweden does too. |
2022-06-22 19:01:14 +0200 | coot | (~coot@213.134.190.95) (Quit: coot) |
2022-06-22 19:01:29 +0200 | <tomsmeding> | Ah |
2022-06-22 19:02:13 +0200 | <tomsmeding> | Can you guys please convert the rest of the world |
2022-06-22 19:02:57 +0200 | <k`> | Sorry, I'm just using the Sweding locale to get yyyy-mm-dd! |
2022-06-22 19:03:05 +0200 | <k`> | *Swedish |
2022-06-22 19:04:02 +0200 | jakalx | (~jakalx@base.jakalx.net) |
2022-06-22 19:04:15 +0200 | <yushyin> | tomsmeding: i wanted something that uses the metric system, sane date format i.e. dd/mm/yyyy and '.' for decimal separator. en_IE was the first thing I came across that fulfilled these conditions |
2022-06-22 19:04:18 +0200 | notzmv | (~zmv@user/notzmv) (Ping timeout: 240 seconds) |
2022-06-22 19:07:37 +0200 | brettgilio | (~brettgili@virtlab.gq) (Ping timeout: 248 seconds) |
2022-06-22 19:07:50 +0200 | lisbeths | (uid135845@id-135845.lymington.irccloud.com) (Quit: Connection closed for inactivity) |
2022-06-22 19:11:01 +0200 | <sm> | ha |
2022-06-22 19:11:49 +0200 | <sm> | I like en_IE too, but I no longer think dd/mm/yyyy is the greatest format |
2022-06-22 19:11:56 +0200 | dlbh^ | (~dlbh@50.237.44.186) (Ping timeout: 268 seconds) |
2022-06-22 19:12:45 +0200 | <EvanR> | dy/ym/dyym to keep things spicy |
2022-06-22 19:12:53 +0200 | andreas303 | (andreas303@ip227.orange.bnc4free.com) |
2022-06-22 19:13:44 +0200 | sm | actually tried to parse that... day of year... year month... day of <explodes> |
2022-06-22 19:13:48 +0200 | <k`> | EvanR: Can I get you on board with 3-space indents and comments in Interlingua as a standard? |
2022-06-22 19:14:43 +0200 | <yushyin> | yyyy-mm-dd is indeed more fancy, but currency with en_DK is DKK and with en_IE it is EUR |
2022-06-22 19:14:51 +0200 | yauhsien_ | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) (Remote host closed the connection) |
2022-06-22 19:15:18 +0200 | <k`> | You can set LC_MONETARY to something different than LC_TIME. |
2022-06-22 19:15:44 +0200 | <EvanR> | forgot about interlingua |
2022-06-22 19:16:44 +0200 | unit73e | (~emanuel@2001:818:e8dd:7c00:32b5:c2ff:fe6b:5291) |
2022-06-22 19:17:16 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) |
2022-06-22 19:17:48 +0200 | <yushyin> | k`: i know! but it was nice to find a locale that more or less is good enough without much mixing different locales |
2022-06-22 19:19:01 +0200 | <EvanR> | en_STATELESS_AND_LOVIN_IT |
2022-06-22 19:19:24 +0200 | <yushyin> | :D |
2022-06-22 19:20:37 +0200 | even4void | (even4void@came.here.for-some.fun) |
2022-06-22 19:22:46 +0200 | <k`> | Glad you can put a positive spin on it... |
2022-06-22 19:24:58 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) (Quit: Leaving) |
2022-06-22 19:25:31 +0200 | azimut | (~azimut@gateway/tor-sasl/azimut) (Ping timeout: 268 seconds) |
2022-06-22 19:25:47 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) |
2022-06-22 19:26:14 +0200 | xacktm | (xacktm@user/xacktm) |
2022-06-22 19:26:45 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) (Ping timeout: 268 seconds) |
2022-06-22 19:27:25 +0200 | <monochrom> | day of <explode> = dies irae >:) |
2022-06-22 19:27:37 +0200 | tzh | (~tzh@c-24-21-73-154.hsd1.or.comcast.net) |
2022-06-22 19:28:21 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) (Client Quit) |
2022-06-22 19:28:58 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) |
2022-06-22 19:30:01 +0200 | jinsun | (~jinsun@user/jinsun) (Ping timeout: 248 seconds) |
2022-06-22 19:30:09 +0200 | <geekosaur> | EvanR: d₂y₃/y₂m₂/d₁y₄y₁m₁ |
2022-06-22 19:30:28 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) (Client Quit) |
2022-06-22 19:31:07 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) |
2022-06-22 19:31:27 +0200 | stiell | (~stiell@gateway/tor-sasl/stiell) |
2022-06-22 19:34:49 +0200 | vhs | (~vhs@c188-151-104-121.bredband.tele2.se) (Client Quit) |
2022-06-22 19:37:53 +0200 | jinsun | (~jinsun@user/jinsun) |
2022-06-22 19:38:25 +0200 | Everything | (~Everythin@37.115.210.35) (Quit: leaving) |
2022-06-22 19:38:25 +0200 | <EvanR> | nice, spontaneous symmetry breaking |
2022-06-22 19:41:02 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection) |
2022-06-22 19:41:16 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 19:41:19 +0200 | dlbh^ | (~dlbh@50.237.44.186) |
2022-06-22 19:41:57 +0200 | mjs22 | (~mjs22@76.115.19.239) |
2022-06-22 19:42:51 +0200 | yauhsien | (~yauhsien@61-231-23-53.dynamic-ip.hinet.net) |
2022-06-22 19:48:55 +0200 | waleee | (~waleee@2001:9b0:213:7200:cc36:a556:b1e8:b340) |
2022-06-22 19:52:40 +0200 | Unicorn_Princess | (~Unicorn_P@93-103-228-248.dynamic.t-2.net) |
2022-06-22 19:55:03 +0200 | Infinite | (~Infinite@2405:204:5381:d6e2:c80:a1c9:d209:de50) |
2022-06-22 19:57:37 +0200 | raym | (~raym@user/raym) (Remote host closed the connection) |
2022-06-22 19:59:58 +0200 | raehik | (~raehik@cpc95906-rdng25-2-0-cust156.15-3.cable.virginm.net) (Ping timeout: 240 seconds) |
2022-06-22 20:02:29 +0200 | _ht | (~quassel@231-169-21-31.ftth.glasoperator.nl) |
2022-06-22 20:04:26 +0200 | raym | (~raym@user/raym) |
2022-06-22 20:05:41 +0200 | <shapr> | I published my first thing to hackage yay! https://hackage.haskell.org/package/takedouble |
2022-06-22 20:11:46 +0200 | <tomsmeding> | shapr: nice and compact :) |
2022-06-22 20:11:54 +0200 | <cjay> | nice, congrats :) |
2022-06-22 20:11:58 +0200 | shapr | dances cheerfully |
2022-06-22 20:21:13 +0200 | vysn | (~vysn@user/vysn) (Ping timeout: 248 seconds) |
2022-06-22 20:21:54 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection) |
2022-06-22 20:32:59 +0200 | __monty__ | (~toonn@user/toonn) |
2022-06-22 20:33:22 +0200 | misterfish | (~misterfis@ip214-130-173-82.adsl2.static.versatel.nl) |
2022-06-22 20:37:03 +0200 | <shapr> | When I ran "cabal check" I got a warning that users may not need "-O2" for my package, how do I test whether it makes a difference? |
2022-06-22 20:37:18 +0200 | Infinite9 | (~Infinite@2405:204:5381:d6e2:c147:f74f:65d9:3fcf) |
2022-06-22 20:37:22 +0200 | Infinite | (~Infinite@2405:204:5381:d6e2:c80:a1c9:d209:de50) (Ping timeout: 252 seconds) |
2022-06-22 20:37:44 +0200 | <shapr> | Is there perhaps a known criterion workflow that can tell me? |
2022-06-22 20:38:31 +0200 | <Infinite9> | I'm trying to understand this line: dirs@Dirs{..} <- getAllDirs. |
2022-06-22 20:38:32 +0200 | <Infinite9> | The <- gets me Dir from IO Dir. Then Dirs{..} destructures so that we don't need to specify all the elements of the record. But I don't understand the @ here. I tried looking it up and visible type applications came up. If that's so, is the '@' conforming the type of dirs to Dir? https://pastebin.com/Ck4tWmBs |
2022-06-22 20:38:59 +0200 | <shapr> | Infinite9: the entire result is assigned to the name 'dirs' |
2022-06-22 20:39:29 +0200 | <monochrom> | Look for "as patterns" instead. This is just Haskell 2010 (and 98, and ...) |
2022-06-22 20:40:07 +0200 | <monochrom> | This is also why Google is still not sentient. |
2022-06-22 20:41:32 +0200 | <Infinite9> | monochrom thanks this helped |
2022-06-22 20:41:32 +0200 | <Infinite9> | Actually, I just randomly entered 'a@b' in ghci and it said "Did you mean to enable TypeApplications?" so I tried looking that up. |
2022-06-22 20:41:57 +0200 | <EvanR> | oof |
2022-06-22 20:42:18 +0200 | <k`> | Think I'm in a very small minority here, but for that and a few other reasons I am not a fan of type applications. |
2022-06-22 20:42:35 +0200 | <k`> | Would much rather write (a :: b). |
2022-06-22 20:42:52 +0200 | <EvanR> | @ is doing double duty here |
2022-06-22 20:43:03 +0200 | <geekosaur> | not fond of them either. some people seem to love them, others consider them a mistake |
2022-06-22 20:43:37 +0200 | <geekosaur> | as patterns didn't get mentioned in ghci because you were in an expression as far as ghci was concerned, whereas as-patterns are part of pattern syntax |
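A minimal sketch of the as-pattern discussed here, assuming a made-up record: the fields of `Dirs`, the body of `getAllDirs`, and the helper `wholeAndHome` are hypothetical stand-ins, since the pastebin contents are not part of this log. `dirs@Dirs{..}` binds the whole value to `dirs`, while `RecordWildCards` brings the fields into scope. It only parses as an as-pattern in pattern position, which is likely why typing `a@b` as an expression in ghci produced the TypeApplications hint instead.

```haskell
{-# LANGUAGE RecordWildCards #-}

-- Hypothetical record standing in for the Dirs type from the paste.
data Dirs = Dirs
  { homeDir  :: FilePath
  , cacheDir :: FilePath
  } deriving (Eq, Show)

-- Hypothetical action; the real getAllDirs presumably asks the system.
getAllDirs :: IO Dirs
getAllDirs = pure (Dirs "/home/user" "/home/user/.cache")

-- The as-pattern names the whole record (dirs) and, via RecordWildCards,
-- each field (homeDir, cacheDir) in a single pattern.
wholeAndHome :: Dirs -> (Dirs, FilePath)
wholeAndHome dirs@Dirs{..} = (dirs, homeDir)

main :: IO ()
main = do
  dirs@Dirs{..} <- getAllDirs  -- same pattern on the left of <-
  print dirs                   -- the whole record
  putStrLn cacheDir            -- one field, in scope thanks to {..}
```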
2022-06-22 20:44:04 +0200 | <EvanR> | someone go back in time and increase the universe of ascii characters slightly |
2022-06-22 20:44:44 +0200 | maerwald | (~maerwald@user/maerwald) (Ping timeout: 255 seconds) |
2022-06-22 20:46:39 +0200 | <geekosaur> | shapr, just time it with and without. beware that -O2 can actually slow things down in some cases |
2022-06-22 20:46:48 +0200 | <monochrom> | Oh ghci is not sentient either. |
2022-06-22 20:47:11 +0200 | <geekosaur> | so cabal strongly encourages you to use -O / -O1 instead |
2022-06-22 20:47:21 +0200 | <monochrom> | This is why I am against error messages doing second-guessing. |
2022-06-22 20:47:36 +0200 | <geekosaur> | and the ghc manual tells you -O2 is usually wasted time both in compilation and running |
2022-06-22 20:49:05 +0200 | <int-e> | hmm, is that true though? |
2022-06-22 20:49:26 +0200 | <int-e> | (The latter; the former... ugh, please leave trading compilation time for runtime to the user!) |
2022-06-22 20:49:47 +0200 | <geekosaur> | more specifically what it says is it usually slows compilation significantly while providing little if any benefit and occasionally making things worse, iirc |
2022-06-22 20:50:25 +0200 | <int-e> | I should do my own profiling. Not saying that I will... |
2022-06-22 20:51:54 +0200 | Pickchea | (~private@user/pickchea) |
2022-06-22 20:54:03 +0200 | <monochrom> | I just idly wonder if the GHC user's guide is outdated on this. |
2022-06-22 20:54:35 +0200 | <int-e> | Well, maybe a sample: one random and tiny program sees a speedup of 15% from using -O2. And it takes just enough time for the runtime improvement to outweigh the extra compilation time. |
2022-06-22 20:55:01 +0200 | <int-e> | (compile + execute is in the 2.5s ballpark for this sample) |
2022-06-22 20:55:02 +0200 | <k`> | I just idly wonder if `lens` is the package that most benefits from -O2, and yet is the package you least want to spend more time compiling. |
2022-06-22 20:55:43 +0200 | <monochrom> | haha |
2022-06-22 20:55:45 +0200 | <int-e> | k`: have you tried building regex-tdfa or haskell-src-exts? |
2022-06-22 20:55:46 +0200 | <k`> | (Much love to Ed K. for making one of the most beautiful, useful packages on Hackage.) |
2022-06-22 20:56:24 +0200 | <k`> | int-e: No. Are you saying that a regex library takes a long time to compile? |
2022-06-22 20:56:36 +0200 | <int-e> | this particular one does, IME |
2022-06-22 20:56:54 +0200 | <int-e> | I never looked into it though. |
2022-06-22 20:57:02 +0200 | <int-e> | (So I don't know why) |
2022-06-22 20:57:25 +0200 | <monochrom> | My idle wonder cuts both ways. bytestring and vector needed -O2 a decade ago. I also idly wonder whether today they still do. |
2022-06-22 20:57:57 +0200 | maerwald | (~maerwald@mail.hasufell.de) |
2022-06-22 20:58:27 +0200 | <monochrom> | Forgive me for not even asking in #ghc, today is a hot day and I feel like chilling out and slacking off :) |
2022-06-22 21:00:16 +0200 | <monochrom> | But I'm happy enough that -O1 already does wonders and is the cabal default. |
2022-06-22 21:00:37 +0200 | <dolio> | I don't really understand why it matters much for those examples. |
2022-06-22 21:01:12 +0200 | <dolio> | If you're working on them, then I understand caring. But people using them recompile them like twice a year. |
2022-06-22 21:01:44 +0200 | <monochrom> | So in the case of bytestring and vector, my recollection is that -O2 turns on the last mile of aggressive fusion that they direly need. |
2022-06-22 21:02:51 +0200 | <monochrom> | So my guess is that takedouble does not need -O2. |
2022-06-22 21:03:27 +0200 | <monochrom> | takedouble is I/O-bound. It probably spends more time waiting for the OS. |
2022-06-22 21:03:31 +0200 | maerwald | (~maerwald@mail.hasufell.de) (Changing host) |
2022-06-22 21:03:31 +0200 | maerwald | (~maerwald@user/maerwald) |
2022-06-22 21:08:21 +0200 | pleo | (~pleo@user/pleo) |
2022-06-22 21:13:58 +0200 | notzmv | (~zmv@user/notzmv) |
2022-06-22 21:20:57 +0200 | machinedgod | (~machinedg@66.244.246.252) (Ping timeout: 248 seconds) |
2022-06-22 21:21:12 +0200 | odnes | (~odnes@5-203-220-108.pat.nym.cosmote.net) (Quit: Leaving) |
2022-06-22 21:21:37 +0200 | machinedgod | (~machinedg@66.244.246.252) |
2022-06-22 21:22:18 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 21:26:49 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Ping timeout: 248 seconds) |
2022-06-22 21:27:25 +0200 | kannon | (~NK@135-180-47-54.fiber.dynamic.sonic.net) |
2022-06-22 21:28:14 +0200 | <Franciman> | sm: thank you very much for the podcast link, i'm enjoying it a lot |
2022-06-22 21:30:39 +0200 | <shapr> | is there some way a running haskell binary can ask cabal for the modules in the library stanza? |
2022-06-22 21:30:45 +0200 | szkl | (uid110435@id-110435.uxbridge.irccloud.com) |
2022-06-22 21:30:49 +0200 | <shapr> | I should probably move this to #haskell-in-depth again |
2022-06-22 21:31:25 +0200 | eggplantade | (~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net) |
2022-06-22 21:33:21 +0200 | <monochrom> | A running haskell binary may be running on a computer that has no cabal in the first place. |
2022-06-22 21:34:02 +0200 | <shapr> | yeah, true |
2022-06-22 21:34:34 +0200 | kimjetwav | (~user@2607:fea8:2340:da00:59a1:33be:cb76:515a) |
2022-06-22 21:34:35 +0200 | dumptruckman | (~dumptruck@45-79-173-88.ip.linodeusercontent.com) (Quit: ZNC - https://znc.in) |
2022-06-22 21:35:10 +0200 | <int-e> | For installed libraries, `ghc-pkg describe` has that kind of information. |
2022-06-22 21:35:15 +0200 | <shapr> | oh interesting |
2022-06-22 21:36:15 +0200 | <geekosaur[m]> | But a running program doesn't even know what libraries it's using |
2022-06-22 21:36:32 +0200 | <int-e> | sure, I shifted the goalpost to somewhere reachable |
2022-06-22 21:36:34 +0200 | <shapr> | I could shell out, but it's probably more trouble than it's worth at this stage |
2022-06-22 21:36:42 +0200 | <kannon> | hi, in this program, why the main in the if/else clause? It works the same without it: https://paste.tomsmeding.com/WYmwl13U |
2022-06-22 21:37:30 +0200 | <kannon> | edited: https://paste.tomsmeding.com/o8LJNdMJ |
2022-06-22 21:38:56 +0200 | <int-e> | hmm that's missing a "=" |
2022-06-22 21:39:15 +0200 | <int-e> | ...so if by "it works the same" you mean that neither version is working... |
2022-06-22 21:39:42 +0200 | <kannon> | sorry yeah second edit https://paste.tomsmeding.com/XgSOUg7v |
2022-06-22 21:39:48 +0200 | <int-e> | but the idea here is to start over, asking for another line of input when the last input wasn't "quit" |
2022-06-22 21:39:51 +0200 | <Maxdamantus> | 00:50:20 < maerwald> what's the gain |
2022-06-22 21:40:13 +0200 | <monochrom> | I mean why not? This is just plain recursion expressing a plain loop. |
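A small sketch of "plain recursion expressing a plain loop". This is not kannon's paste (its contents are not in this log); the prompt text, the replies, and the helper `step` are assumptions made for illustration. The recursive call to `main` is what asks for another line; without it the program would handle one line and exit.

```haskell
import System.IO (hFlush, stdout)

-- Hypothetical helper: decide what to do with one input line.
-- Nothing means stop; Just carries the reply to print.
step :: String -> Maybe String
step "quit" = Nothing
step line   = Just ("you said: " ++ line)

main :: IO ()
main = do
  putStr "> "
  hFlush stdout            -- make sure the prompt shows before reading
  line <- getLine
  case step line of
    Nothing    -> putStrLn "bye"
    Just reply -> do
      putStrLn reply
      main                 -- recursive call: start over and read another line
```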
2022-06-22 21:40:53 +0200 | <Maxdamantus> | maerwald: I don't think it requires any of the things you listed (from the user of Haskell), but I guess to summarise the overall gain, it means that in general, it should be harder for code to be incorrect at handling data. |
2022-06-22 21:40:55 +0200 | <int-e> | So I'm not sure in which way the behavior is the same without that line... maybe this indicates lack of testing :P |
2022-06-22 21:41:13 +0200 | <kannon> | int-e I had it written correctly in ghci. they both worked whether main was in the clause or not.. |
2022-06-22 21:41:30 +0200 | <int-e> | "worked" |
2022-06-22 21:41:44 +0200 | <int-e> | well, that may be the case, you didn't supply a specification |
2022-06-22 21:41:54 +0200 | <monochrom> | Yeah I call confirmation bias. |
2022-06-22 21:41:59 +0200 | <Maxdamantus> | maerwald: with the `ShortString` mechanism, someone clever could manually encode their UTF-8 or UTF-16 strings to `ShortString` without going through the proper APIs, and then they'll have code that seems to work on one platform but fails on the other platform even for well-formed Unicode. |
2022-06-22 21:42:28 +0200 | <kannon> | specification ? int-e |
2022-06-22 21:42:53 +0200 | <int-e> | "it works" is essentially devoid of meaning |
2022-06-22 21:43:09 +0200 | <Maxdamantus> | maerwald: and I have a feeling there could be security issues due to the mixing of encoding forms (that is, because `ShortString` sometimes represents UTF-8 and sometimes represents UTF-16). |
2022-06-22 21:43:16 +0200 | <int-e> | because it doesn't say what the expected behavior is |
2022-06-22 21:43:46 +0200 | <int-e> | kannon: http://paste.debian.net/1244895/ <-- this won't work the same way if you drop the call to `main` inside `main`. |
2022-06-22 21:45:18 +0200 | <kannon> | one moment thanks int-e |
2022-06-22 21:45:31 +0200 | <Maxdamantus> | maerwald: I think UTF-8 was pretty much designed around this principle. Unless you're using `wchar_t`, it's actually kind of hard to write C code that doesn't handle UTF-8 properly. |
2022-06-22 21:45:46 +0200 | <maerwald> | Maxdamantus: uhm... the UTF-8 roundtripping has security issues |
2022-06-22 21:45:57 +0200 | <maerwald> | see https://unicode.org/L2/L2009/09236-pep383-problems.html |
2022-06-22 21:46:04 +0200 | <maerwald> | and http://blog.omega-prime.co.uk/2011/03/29/security-implications-of-pep-383/ |
2022-06-22 21:46:05 +0200 | juri_ | (~juri@79.140.115.124) (Read error: Connection reset by peer) |
2022-06-22 21:46:18 +0200 | <maerwald> | if you don't touch the filepath encodings, there are none of those issues |
2022-06-22 21:46:20 +0200 | dumptruckman | (~dumptruck@23-239-13-163.ip.linodeusercontent.com) |
2022-06-22 21:47:12 +0200 | <maerwald> | also: broken serialisation, broken equality checks, etc. |
2022-06-22 21:47:15 +0200 | juri_ | (~juri@79.140.115.124) |
2022-06-22 21:47:21 +0200 | juri_ | (~juri@79.140.115.124) (Read error: Connection reset by peer) |
2022-06-22 21:47:54 +0200 | <Maxdamantus> | maerwald: presumably you would be talking about security issues in my solution (using WTF-8) specifically on Windows? |
2022-06-22 21:48:06 +0200 | <maerwald> | no, I'm talking about PEP 383 |
2022-06-22 21:48:08 +0200 | <Maxdamantus> | on Linux the conversion is a no-op. |
2022-06-22 21:48:21 +0200 | <maerwald> | which current Haskell code is using |
2022-06-22 21:48:22 +0200 | mc47 | (~mc47@xmonad/TheMC47) |
2022-06-22 21:48:29 +0200 | dlbh^ | (~dlbh@50.237.44.186) (Ping timeout: 256 seconds) |
2022-06-22 21:48:44 +0200 | <maerwald> | however, *without* enforcing UTF-8 |
2022-06-22 21:49:20 +0200 | <maerwald> | and PEP 383 doesn't work for every encoding. It's only "total" under fully roundtrippable encodings and those that are ASCII supersets |
2022-06-22 21:50:05 +0200 | <maerwald> | the alternative would be forcing UTF-8 for all haskell code... then all your non-UTF8 filepaths have odd representations in Haskell |
2022-06-22 21:50:21 +0200 | <maerwald> | but they would at least be roundtrippable |
2022-06-22 21:50:34 +0200 | <kannon> | int-e: thanks I see the difference. cheers |
2022-06-22 21:50:35 +0200 | <maerwald> | but now you lost the original encoding, lol |
2022-06-22 21:50:43 +0200 | <Maxdamantus> | bytes are always roundtrippable, because there's no conversion. |
2022-06-22 21:50:48 +0200 | <Maxdamantus> | gtg |
2022-06-22 21:51:09 +0200 | <maerwald> | Maxdamantus: no, they are not |
2022-06-22 21:51:15 +0200 | <Maxdamantus> | or do you mean roundtrippable from [Char]? |
2022-06-22 21:51:34 +0200 | <maerwald> | https://peps.python.org/pep-0383/ |
2022-06-22 21:51:37 +0200 | <Maxdamantus> | I think conversion from [Char] to filenames should just emit replacement characters on error. |
2022-06-22 21:51:49 +0200 | <Maxdamantus> | (eg, when using the PEP-383 encoding) |
2022-06-22 21:51:54 +0200 | <Maxdamantus> | I'm familiar with PEP-383. |
2022-06-22 21:52:16 +0200 | juri_ | (~juri@79.140.115.124) |
2022-06-22 21:53:02 +0200 | <maerwald> | the problem now is also that conversion functions running on your filepath have to understand the meaning of those PEP-383 lone surrogates |
2022-06-22 21:53:08 +0200 | <maerwald> | or they might create security bugs |
2022-06-22 21:54:25 +0200 | <maerwald> | PEP-383 is only safe, if the user does nothing with the filepaths, but just passes them around |
2022-06-22 21:56:32 +0200 | <maerwald> | I dunno... why not just stop messing with them :p |
2022-06-22 21:56:39 +0200 | kannon | (~NK@135-180-47-54.fiber.dynamic.sonic.net) (Quit: leaving) |
2022-06-22 21:57:04 +0200 | <EvanR> | formal abstract filepath algebra |
2022-06-22 21:57:27 +0200 | <EvanR> | filepath semigroupoids |
2022-06-22 21:57:52 +0200 | <EvanR> | don't worry about what they are, only worry about where they go |
2022-06-22 21:58:09 +0200 | <Maxdamantus> | maerwald: this only applies to conversion from [Char], which should emit replacement characters for surrogate Char values. |
2022-06-22 21:58:38 +0200 | <maerwald> | Maxdamantus: huh? |
2022-06-22 21:58:51 +0200 | juri_ | (~juri@79.140.115.124) (Read error: Connection reset by peer) |
2022-06-22 21:58:57 +0200 | <maerwald> | if you emit replacement char, you break the semantics |
2022-06-22 21:59:10 +0200 | juri_ | (~juri@79.140.115.124) |
2022-06-22 21:59:11 +0200 | <maerwald> | you might even delete a wrong file :p |
2022-06-22 21:59:40 +0200 | <Maxdamantus> | you can't in general round trip with [Char] to filenames. |
2022-06-22 22:00:00 +0200 | <Maxdamantus> | If you could, we could just continue using that for representing filenames. |
2022-06-22 22:00:06 +0200 | <maerwald> | roundtripping is well defined for UTF-8 with PEP 383 |
2022-06-22 22:00:14 +0200 | <maerwald> | you can roundtrip any bytestring through that afaik |
2022-06-22 22:00:31 +0200 | <EvanR> | (but how do you utf-8 encode a surrogate Char) |
2022-06-22 22:00:54 +0200 | <EvanR> | or is that an obvious |
2022-06-22 22:01:47 +0200 | Infinite9 | (~Infinite@2405:204:5381:d6e2:c147:f74f:65d9:3fcf) (Quit: Client closed) |
2022-06-22 22:02:51 +0200 | <Maxdamantus> | Well, you could do it that way, but that change would probably break current Haskell code. |
2022-06-22 22:03:42 +0200 | juri_ | (~juri@79.140.115.124) (Ping timeout: 264 seconds) |
2022-06-22 22:03:45 +0200 | dlbh^ | (~dlbh@50.237.44.186) |
2022-06-22 22:04:30 +0200 | juri_ | (~juri@84-19-175-179.pool.ovpn.com) |
2022-06-22 22:04:31 +0200 | <Maxdamantus> | I think your method results in security issues because of the different representations of well-formed Unicode. |
2022-06-22 22:04:50 +0200 | <maerwald> | Maxdamantus: there's no unicode in abstract filepath. |
2022-06-22 22:05:31 +0200 | eggplantade | (~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net) (Remote host closed the connection) |
2022-06-22 22:05:33 +0200 | superbil | (~superbil@1-34-176-171.hinet-ip.hinet.net) (Ping timeout: 265 seconds) |
2022-06-22 22:06:03 +0200 | <Maxdamantus> | eg, "à" is sometimes represented as [0x00c3, 0x00a0] and sometimes represented as [0x00e0] |
2022-06-22 22:06:19 +0200 | <maerwald> | Maxdamantus: those are different platforms |
2022-06-22 22:07:00 +0200 | <Maxdamantus> | if someone accidentally produces the wrong coding for the platform, you've got a string that is equal to the interpretation of ill-formed unicode. |
2022-06-22 22:07:14 +0200 | <Maxdamantus> | that's where security issues arise. |
2022-06-22 22:07:17 +0200 | <maerwald> | Maxdamantus: I don't understand what that means |
2022-06-22 22:08:10 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 22:08:28 +0200 | <maerwald> | 1. there's no such thing as "wrong encoding" for abstract filepath, 2. they are distinct across platforms (they don't even have the same constructor)... so it's not even possible to accidentally compare a windows filepath with a unix filepath. That doesn't compile. |
2022-06-22 22:09:11 +0200 | <maerwald> | you'd have to explicitly convert them to ByteString or ShortByteString at which point, the library has no business with what you're doing anymore |
2022-06-22 22:09:39 +0200 | <Maxdamantus> | maerwald: if someone hardcodes [0x00e0] into their program because it works on Windows, then the program is run on Linux, I can match that string by providing the ill-formed UTF-8, <E0> |
2022-06-22 22:10:05 +0200 | <maerwald> | Maxdamantus: how would the user do that? |
2022-06-22 22:10:29 +0200 | <maerwald> | there is no *safe* function to do that |
2022-06-22 22:11:22 +0200 | <Maxdamantus> | maerwald: by providing a filename on Linux that contains that invalid UTF-8. |
2022-06-22 22:12:10 +0200 | <Maxdamantus> | maerwald: maybe it's a program that scans through a directory and executes a file if it's called "à" |
2022-06-22 22:12:13 +0200 | <maerwald> | this makes no sense to me... you're saying users can use unsafe API to construct wrong filepaths and then claim that's the fault of the library? |
2022-06-22 22:12:24 +0200 | johnw | (~johnw@76-234-69-149.lightspeed.frokca.sbcglobal.net) |
2022-06-22 22:12:27 +0200 | <maerwald> | you can already do that today with string based filepaths by switching encoding in between |
2022-06-22 22:13:11 +0200 | <monochrom> | unsafePerformIO comes from the library. It is the fault of the library. :) |
2022-06-22 22:13:14 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 268 seconds) |
2022-06-22 22:13:36 +0200 | <Maxdamantus> | maerwald: the admin has other ways of preventing people from making "à" files, but it turns out that that's not the actual name being tested. |
2022-06-22 22:13:49 +0200 | <EvanR> | now I'm imagining an idealized program which has a clean separation between pure code and the OS API, and being allowed to use both linux and windows at will, somehow xD |
2022-06-22 22:13:58 +0200 | <Maxdamantus> | anyway, need to stop typing. on phone on a bus and my hands are really cold. |
2022-06-22 22:14:29 +0200 | <maerwald> | Maxdamantus: I think you should check out the API. You'll see that it isn't easy to do what you're suggesting without either using internal modules or using functions that have the *unsafe* prefix |
2022-06-22 22:14:33 +0200 | <monochrom> | I thought the phone would be hot enough to warm your hands. Mine does. |
2022-06-22 22:15:04 +0200 | <geekosaur> | if you're relying on a unix filename being utf-8 you are sinning anyway |
2022-06-22 22:15:29 +0200 | <monochrom> | Yikes, I rely on utf-8 unix filenames all the time... |
2022-06-22 22:15:53 +0200 | <maerwald> | EvanR: I'm not sure about your proposal, I'll have to try a few examples |
2022-06-22 22:16:00 +0200 | <monochrom> | I have some Chinese filenames. Not going back to Big5. :) |
2022-06-22 22:16:01 +0200 | <geekosaur> | if you control those names it may be a safe assumption. until your backup program assumes latin-1… |
2022-06-22 22:16:30 +0200 | <monochrom> | Ah, true. Now I need to check that duplicity doesn't break my backup :) |
2022-06-22 22:16:31 +0200 | <maerwald> | EvanR: you mean basically bytestring that have all sorts of random surrogate chars... and whether pep 383 will choke on it? |
2022-06-22 22:16:50 +0200 | <EvanR> | maerwald, what I'm thinking of is impossible in practice... programs run on 1 OS at a time |
2022-06-22 22:17:00 +0200 | <EvanR> | as far as I know |
2022-06-22 22:17:13 +0200 | <geekosaur> | until windows decides to integrate wsl better |
2022-06-22 22:17:26 +0200 | <geekosaur> | (or goes back to the old posix subsystem stuff) |
2022-06-22 22:17:39 +0200 | <maerwald> | I ran a property test over the UTF-8 roundtrip encoding feeding it random bytestrings... it always roundtripped |
2022-06-22 22:18:06 +0200 | <EvanR> | I think issues with filepath exist way before we have such tech |
2022-06-22 22:19:12 +0200 | superbil | (~superbil@1-34-176-171.hinet-ip.hinet.net) |
2022-06-22 22:19:23 +0200 | werneta | (~werneta@137.78.30.207) |
2022-06-22 22:19:58 +0200 | z0k | (~z0k@206.84.141.12) (Ping timeout: 240 seconds) |
2022-06-22 22:20:06 +0200 | _ht | (~quassel@231-169-21-31.ftth.glasoperator.nl) (Remote host closed the connection) |
2022-06-22 22:20:12 +0200 | <monochrom> | EvanR: Continuing your crazy plan, we can re-define RPC to mean "relayed process control" meaning that you run a program on Windows and then you just suspend it and send its memory dump to a Linux host and resume running there. >:) |
2022-06-22 22:20:30 +0200 | <EvanR> | ah that might be a way |
2022-06-22 22:20:39 +0200 | <geekosaur> | isn't that where llvm came from? |
2022-06-22 22:20:49 +0200 | <monochrom> | Oh haha |
2022-06-22 22:20:49 +0200 | <geekosaur> | supercomputers want that tech |
2022-06-22 22:20:55 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 22:21:28 +0200 | <monochrom> | But blame it on ST:TNG for giving me that idea multiple times with its holodeck tricks. |
2022-06-22 22:21:52 +0200 | <monochrom> | holodeck+beaming tricks |
2022-06-22 22:22:23 +0200 | <EvanR> | moriarty I dare you to walk out that door |
2022-06-22 22:22:23 +0200 | shriekingnoise | (~shrieking@201.212.175.181) (Quit: Quit) |
2022-06-22 22:22:29 +0200 | <EvanR> | no problem |
2022-06-22 22:22:31 +0200 | <monochrom> | Heh |
2022-06-22 22:22:44 +0200 | shriekingnoise | (~shrieking@201.212.175.181) |
2022-06-22 22:23:22 +0200 | <geekosaur> | I still want to know what numbskull didn't completely isolate those systems… |
2022-06-22 22:23:23 +0200 | <EvanR> | cogito ergo sum |
2022-06-22 22:23:48 +0200 | zeenk | (~zeenk@2a02:2f04:a301:3d00:39df:1c4b:8a55:48d3) |
2022-06-22 22:24:02 +0200 | <EvanR> | TNG's optimistic future has lax computer security |
2022-06-22 22:24:09 +0200 | <monochrom> | I'm a great tautologist. I'll one-up Descartes with: cogito ergo cogito. |
2022-06-22 22:24:17 +0200 | <k`> | Well, look, sometimes when we get transporter transmissions we get data that can't be encoded as physical matter. So we send it to the holodeck and see what it looks like... |
2022-06-22 22:24:21 +0200 | bitdex | (~bitdex@gateway/tor-sasl/bitdex) (Ping timeout: 268 seconds) |
2022-06-22 22:24:51 +0200 | <monochrom> | Clearly Data is encoded as physical matter. >:) |
2022-06-22 22:25:55 +0200 | <k`> | You really don't want to create an anti-Riker just because a few bits got flipped in the transporter. But at the same time, you don't want to drop all his information because it's invalid. |
2022-06-22 22:26:17 +0200 | <geekosaur> | that's what ecc is for |
2022-06-22 22:26:25 +0200 | jmdaemon | (~jmdaemon@user/jmdaemon) |
2022-06-22 22:26:34 +0200 | <EvanR> | anti-riker is impossible, how would you add a beard (because we shall not speak of anything featuring him without a beard) |
2022-06-22 22:26:36 +0200 | <geekosaur> | and fec, etc. |
2022-06-22 22:26:50 +0200 | <k`> | I think the Federation uses a patented Gray encoding. |
2022-06-22 22:27:33 +0200 | bitdex | (~bitdex@gateway/tor-sasl/bitdex) |
2022-06-22 22:27:55 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Remote host closed the connection) |
2022-06-22 22:28:06 +0200 | <EvanR> | wait is the transporter protocols and inevitable dramatic failures actually relevant to Filepath after all |
2022-06-22 22:28:21 +0200 | <monochrom> | Sorry! |
2022-06-22 22:28:47 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) |
2022-06-22 22:29:10 +0200 | <monochrom> | But I guess inevitable dramatic failures in general are relevant to everything including file paths. |
2022-06-22 22:29:30 +0200 | <k`> | Is `git-annex` creating beardless Rikers? |
2022-06-22 22:29:43 +0200 | <monochrom> | The whole point why people are talking about inevitable dramatic failures when using PEP-383 for file paths. |
2022-06-22 22:30:25 +0200 | <monochrom> | I think no matter what you use for file paths, you will have dramatic failures. |
2022-06-22 22:30:55 +0200 | mikoto-chan | (~mikoto-ch@esm-84-240-99-143.netplaza.fi) |
2022-06-22 22:31:12 +0200 | <maerwald> | the problem with PEP-383 is: 1. you lose the original encoding 2. it actually produces invalid UTF-8 in the strict sense |
2022-06-22 22:31:25 +0200 | <maerwald> | so if you run a strict UTF-8 converter over it, it fails |
2022-06-22 22:31:54 +0200 | <maerwald> | https://gist.github.com/hasufell/c600d318bdbe010a7841cc351c835f92#failure-6-re-encoding-pep-383-ut… |
2022-06-22 22:32:42 +0200 | <maerwald> | that's not a great property to have lol |
2022-06-22 22:32:53 +0200 | <k`> | maerwald: The whole point of 383 is to not use strict UTF8 and to have a reversible encoder so you never lose the original encoding. |
2022-06-22 22:32:55 +0200 | mc47 | (~mc47@xmonad/TheMC47) (Remote host closed the connection) |
2022-06-22 22:32:58 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) (Ping timeout: 240 seconds) |
2022-06-22 22:33:54 +0200 | <maerwald> | k`: yes, but it doesn't work well |
2022-06-22 22:33:57 +0200 | <monochrom> | So in Haskell or any language with sum types, we can go like "data Path = Decodable [Char] | Undecodable ByteString", and neither case distorts the data. |
2022-06-22 22:34:14 +0200 | <monochrom> | (Replace [] by any efficient sequence container type you like) |
2022-06-22 22:34:33 +0200 | <monochrom> | Things like PEP-383 are invented by people who are afraid of sum types. |
2022-06-22 22:34:38 +0200 | <maerwald> | k`: a call to `setFileSystemEncoding` can make it fail... serializing the String is unsafe, etc. |
2022-06-22 22:34:52 +0200 | <monochrom> | or even the class-subclass encoding of sum types. |
2022-06-22 22:34:58 +0200 | <EvanR> | how about | Unencodable [Char] xD |
2022-06-22 22:35:13 +0200 | lyle | (~lyle@104.246.145.85) (Quit: WeeChat 3.5) |
2022-06-22 22:35:19 +0200 | <geekosaur> | Char assumes Unicode codepoints. [Word8] |
2022-06-22 22:35:26 +0200 | <k`> | monochrom: So then when you append a Decodable prefix to an Undecodable filename you end up with an Undecodable path? |
2022-06-22 22:35:31 +0200 | <geekosaur> | which makes it just an inefficient ByteString |
2022-06-22 22:35:41 +0200 | <EvanR> | the utf16 surrogates... |
2022-06-22 22:35:46 +0200 | coot | (~coot@213.134.190.95) |
2022-06-22 22:35:56 +0200 | <monochrom> | I want Unicode codepoints, Haskell Char, in the Decodable case. |
2022-06-22 22:35:58 +0200 | <EvanR> | oh, you meant utf32 |
2022-06-22 22:35:59 +0200 | <geekosaur> | k`, what else could you end up with? |
2022-06-22 22:36:38 +0200 | <monochrom> | I think we should not allow that appending. |
2022-06-22 22:36:57 +0200 | <geekosaur> | probably the most correct solution |
2022-06-22 22:37:21 +0200 | <EvanR> | how can you deny the power of the / |
2022-06-22 22:37:22 +0200 | <geekosaur> | (granting that someone will want it, but in that case they should provide an Undecodable prefix) |
2022-06-22 22:38:17 +0200 | <monochrom> | Unpopular opinion: The whole point of PEP-383 is avoiding real sum types and rolling your own tagging. |
2022-06-22 22:38:25 +0200 | <EvanR> | if you have two valid paths, how could / not join them xD |
2022-06-22 22:38:45 +0200 | <monochrom> | Right? Use a lone low surrogate (U+DC80..U+DCFF) as tag for "I can't decode this byte, here is the byte itself" |
2022-06-22 22:38:59 +0200 | <monochrom> | In Haskell land we call it "Either Char Word8" |
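The "Either Char Word8" reading can be made explicit by unpacking a PEP 383 string back into tagged units. A rough sketch in Python, assuming the escaped bytes occupy the code points U+DC80..U+DCFF as PEP 383 specifies; the function name and the `"char"`/`"byte"` tags are invented for illustration.

```python
from typing import List, Tuple, Union

Unit = Tuple[str, Union[str, int]]


def tag_units(raw: bytes) -> List[Unit]:
    """Decode per PEP 383, then expose the hidden tagging explicitly."""
    units: List[Unit] = []
    for ch in raw.decode("utf-8", errors="surrogateescape"):
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            # Escaped byte: undo the U+DC00 offset to get the original byte.
            units.append(("byte", cp - 0xDC00))
        else:
            units.append(("char", ch))
    return units
```

For the input `b"a\xffb"` this yields a character, a raw byte 0xFF, and another character, so the tagging is per byte rather than per path.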
2022-06-22 22:43:46 +0200 | <k`> | So is the string undecodable after the bad byte or just at the bad byte? |
2022-06-22 22:44:09 +0200 | <monochrom> | I guess my idea still doesn't answer the question of comparing a decodable path with an undecodable path, the latter being undecodable just because of unfortunate locale settings. |
2022-06-22 22:44:50 +0200 | <monochrom> | My idea declares the whole path undecodable. PEP-383 declares individual bytes undecodable. |
2022-06-22 22:45:26 +0200 | <monochrom> | I think there is no answer to that question. |
2022-06-22 22:45:40 +0200 | <maerwald> | the answer is: don't decode if you don't have to :p |
2022-06-22 22:45:51 +0200 | <maerwald> | and most of the time, you actually don't |
2022-06-22 22:46:07 +0200 | <monochrom> | Ah, right, I can stand behind that. |
2022-06-22 22:46:10 +0200 | <maerwald> | e.g. you don't need to understand the filename encoding when splitting filepaths |
2022-06-22 22:46:19 +0200 | <maerwald> | because the separator char '/' is well defined |
2022-06-22 22:46:24 +0200 | <maerwald> | and not encoding specific |
2022-06-22 22:46:30 +0200 | <maerwald> | you just scan and split, ignoring the rest |
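Splitting without decoding is straightforward when the path is kept as bytes. A minimal sketch in Python, assuming POSIX-style paths where the separator is the single byte 0x2F; the function name and the sample path are hypothetical.

```python
from typing import List


def split_path(raw: bytes) -> List[bytes]:
    """Split on the separator byte; the components are never decoded."""
    return raw.split(b"/")


# Components may hold bytes that are invalid in any given encoding.
parts = split_path(b"/home/caf\xe9/\xff.txt")

# Joining with the same separator restores the original path exactly.
rejoined = b"/".join(parts)
```

The leading separator shows up as an empty first component, and `rejoined` is byte-for-byte identical to the input.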
2022-06-22 22:47:42 +0200 | nate4 | (~nate@98.45.169.16) |
2022-06-22 22:47:42 +0200 | takuan | (~takuan@178-116-218-225.access.telenet.be) (Remote host closed the connection) |
2022-06-22 22:47:48 +0200 | <k`> | You also need to check for an escaped '/', but I see what you're saying. |
2022-06-22 22:48:13 +0200 | <monochrom> | I think I haven't seen a file system that provides for an escaped / |
2022-06-22 22:48:14 +0200 | <maerwald> | k`: so would you rather see Haskell enforcing UTF-8 so that PEP 383 actually works *all the time*? |
2022-06-22 22:48:42 +0200 | <k`> | maerwald: Yes. |
2022-06-22 22:48:43 +0200 | <maerwald> | that would mean ignoring the locale |
2022-06-22 22:48:48 +0200 | <monochrom> | or windows providing for an escaped \ |
2022-06-22 22:50:44 +0200 | <EvanR> | on mac typing / into the filename causes a fancy phantom / character from the astral plane to be used |
2022-06-22 22:50:52 +0200 | <geekosaur> | iirc namei() or equivalent is not per filesystem so there is no way to escape / regardless of filesystem |
2022-06-22 22:51:06 +0200 | <geekosaur> | back in the day that was converted to : |
2022-06-22 22:51:19 +0200 | <geekosaur> | and similarly : to / (whee ancient macos) |
2022-06-22 22:51:21 +0200 | <EvanR> | ascii / means / |
2022-06-22 22:51:36 +0200 | <maerwald> | k`: why not simply avoid roundtripping? |
2022-06-22 22:53:05 +0200 | Pickchea | (~private@user/pickchea) (Ping timeout: 256 seconds) |
2022-06-22 22:54:04 +0200 | jgeerds | (~jgeerds@55d45f48.access.ecotel.net) |
2022-06-22 22:56:02 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection) |
2022-06-22 22:59:50 +0200 | Tuplanolla | (~Tuplanoll@91-159-69-97.elisa-laajakaista.fi) |
2022-06-22 23:01:44 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) |
2022-06-22 23:04:11 +0200 | bitdex | (~bitdex@gateway/tor-sasl/bitdex) (Remote host closed the connection) |
2022-06-22 23:05:27 +0200 | coot | (~coot@213.134.190.95) (Quit: coot) |
2022-06-22 23:06:21 +0200 | bitdex | (~bitdex@gateway/tor-sasl/bitdex) |
2022-06-22 23:13:39 +0200 | pleo | (~pleo@user/pleo) (Read error: Connection reset by peer) |
2022-06-22 23:14:01 +0200 | pleo | (~pleo@user/pleo) |
2022-06-22 23:14:36 +0200 | MironZ3 | (~MironZ@nat-infra.ehlab.uk) |
2022-06-22 23:14:55 +0200 | yrlnry | (~yrlnry@pool-108-2-150-109.phlapa.fios.verizon.net) |
2022-06-22 23:14:57 +0200 | shriekingnoise | (~shrieking@201.212.175.181) (Quit: Quit) |
2022-06-22 23:15:16 +0200 | shriekingnoise | (~shrieking@201.212.175.181) |
2022-06-22 23:15:59 +0200 | MironZ | (~MironZ@nat-infra.ehlab.uk) (Quit: Ping timeout (120 seconds)) |
2022-06-22 23:15:59 +0200 | MironZ3 | MironZ |
2022-06-22 23:20:39 +0200 | rendar | (~Paxman@user/rendar) (Quit: Leaving) |
2022-06-22 23:25:23 +0200 | mikoto-chan | (~mikoto-ch@esm-84-240-99-143.netplaza.fi) (Ping timeout: 256 seconds) |
2022-06-22 23:25:59 +0200 | mikoto-chan | (~mikoto-ch@esm-84-240-99-143.netplaza.fi) |
2022-06-22 23:28:28 +0200 | misterfish | (~misterfis@ip214-130-173-82.adsl2.static.versatel.nl) (Ping timeout: 268 seconds) |
2022-06-22 23:28:59 +0200 | liz | (~liz@host86-159-158-175.range86-159.btcentralplus.com) |
2022-06-22 23:35:01 +0200 | mikoto-chan | (~mikoto-ch@esm-84-240-99-143.netplaza.fi) (Ping timeout: 244 seconds) |
2022-06-22 23:35:39 +0200 | bilegeek | (~bilegeek@2600:1008:b06f:8528:b8b4:9bf9:3a8:ef97) |
2022-06-22 23:45:18 +0200 | __monty__ | (~toonn@user/toonn) (Quit: leaving) |
2022-06-22 23:46:23 +0200 | <tomsmeding> | I believe it's still converted to : nowadays if you enter a / in Finder, or at least that worked a few years ago still |
2022-06-22 23:46:34 +0200 | <tomsmeding> | also / |
2022-06-22 23:47:43 +0200 | <EvanR> | yeah that thing |
2022-06-22 23:47:58 +0200 | <EvanR> | proprietary solidus |
2022-06-22 23:49:11 +0200 | nate4 | (~nate@98.45.169.16) (Ping timeout: 256 seconds) |
2022-06-22 23:51:14 +0200 | Qudit | (~user@user/Qudit) (Remote host closed the connection) |
2022-06-22 23:52:46 +0200 | <tomsmeding> | スラッシュ ("slash" in Japanese) |
2022-06-22 23:53:06 +0200 | eggplantade | (~Eggplanta@2600:1700:bef1:5e10:99c9:a0a4:f69e:b22e) (Remote host closed the connection) |
2022-06-22 23:53:26 +0200 | justsomeguy | (~justsomeg@user/justsomeguy) |
2022-06-22 23:54:11 +0200 | gmg | (~user@user/gehmehgeh) (Quit: Leaving) |
2022-06-22 23:57:31 +0200 | michalz | (~michalz@185.246.204.107) (Remote host closed the connection) |