2025/11/13

Newest at the top

2025-11-13 11:59:49 +0100 <kuribas`> right :)
2025-11-13 11:59:38 +0100 <[exa]> kuribas`: anyway you might have noticed that you hit 2 professional hashmap haters today
2025-11-13 11:59:34 +0100 <kuribas`> [exa]: what if I care about time more than space?
2025-11-13 11:59:25 +0100 <merijn> https://sqlite.org/stricttables.html
2025-11-13 11:59:13 +0100 <merijn> kuribas`: Have you not heard the glorious news of SQLite STRICT mode? :p
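[editor's note] A minimal sketch of what STRICT mode buys you from Python, in answer to the "untyped queries" complaint. The `device` table and its columns are hypothetical, not from kuribas`'s project. STRICT needs SQLite 3.37.0 or later, so the function returns None when the linked library is older.

```python
import sqlite3


def strict_rejects_bad_type():
    """Return True if a STRICT table rejects text in an INTEGER column.

    Returns None when the linked SQLite library predates STRICT tables.
    """
    if sqlite3.sqlite_version_info < (3, 37, 0):
        return None
    con = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, for illustration only.
        con.execute(
            "CREATE TABLE device ("
            "id INTEGER PRIMARY KEY, model TEXT, ports INTEGER"
            ") STRICT"
        )
        # A well-typed row is accepted.
        con.execute("INSERT INTO device (model, ports) VALUES (?, ?)", ("foo", 2))
        try:
            # 'many' cannot be losslessly converted to INTEGER, so a STRICT
            # table should refuse it instead of silently storing text.
            con.execute(
                "INSERT INTO device (model, ports) VALUES (?, ?)", ("bar", "many")
            )
        except sqlite3.Error:
            return True
        return False
    finally:
        con.close()
```

Without the STRICT keyword the second insert would succeed and store the string, which is the behaviour being complained about.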
2025-11-13 11:58:40 +0100 <[exa]> kuribas`: no it requires a trie, hashmaps waste space for interning
2025-11-13 11:58:34 +0100 <kuribas`> I have sqlite in my python project, and I feel it's just now untyped queries instead of typed data processing...
2025-11-13 11:57:40 +0100 <merijn> All I will say is that I've switched to SQLite from whatever I was using 3 times in projects, and every single time I have the same epiphany :p Which is, that I should use more SQLite in everything :p
2025-11-13 11:56:43 +0100 <kuribas`> merijn: yeah, but the interning needs another hashmap.
2025-11-13 11:56:22 +0100 <merijn> kuribas`: Naah, with interned strings you could just use a tree map, since then [exa] comment about repeated compares goes away :p
2025-11-13 11:55:55 +0100 <kuribas`> Sure I can intern all the strings, that would just be another hashmap...
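[editor's note] A rough sketch of the interning idea being argued about here. Strings are mapped to small integers once, and the index itself only ever compares integers. Python's standard library has no tree map, so a sorted list with `bisect` stands in for one; `Interner` and `SortedIntMap` are made-up names. Note that the interner below does use a dict, which is exactly kuribas`'s objection.

```python
from bisect import bisect_left


class Interner:
    """Assigns each distinct string a small integer id."""

    def __init__(self):
        self._ids = {}       # string -> id (this is the "other hashmap")
        self._strings = []   # id -> string

    def intern(self, s):
        ident = self._ids.get(s)
        if ident is None:
            ident = len(self._strings)
            self._ids[s] = ident
            self._strings.append(s)
        return ident

    def lookup(self, ident):
        return self._strings[ident]


class SortedIntMap:
    """Ordered map over integer keys; a stand-in for a tree-based map."""

    def __init__(self):
        self._keys = []
        self._values = []

    def insert(self, key, value):
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value          # overwrite existing key
        else:
            self._keys.insert(i, key)        # keep keys sorted
            self._values.insert(i, value)

    def get(self, key, default=None):
        # Binary search: O(log n) integer compares, no string compares.
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default


names = Interner()
table = SortedIntMap()
table.insert(names.intern("driver-a"), 1)
table.insert(names.intern("driver-b"), 2)
```

Each string is hashed or compared once at the boundary; everything after that works on fixed-width keys.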
2025-11-13 11:55:44 +0100 <merijn> kuribas`: tbh 1) it's probably worth figuring out how to do it in SQLite anyway, 2) if you can turn it into a JSON encoding you can just store (and query) json blobs in SQLite :p
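[editor's note] A small sketch of merijn's second suggestion, storing JSON text in SQLite and querying inside it. The `rule` table and the field names are invented for illustration, and it assumes the linked SQLite was built with the JSON functions (`json_extract`), which is the default in recent versions.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical table holding one JSON document per row.
con.execute("CREATE TABLE rule (id INTEGER PRIMARY KEY, body TEXT)")

rules = [
    {"driver": "foo", "model": "*", "ports": 2},
    {"driver": "bar", "model": "x1", "ports": 4},
]
con.executemany(
    "INSERT INTO rule (body) VALUES (?)",
    [(json.dumps(r),) for r in rules],
)


def drivers_with_ports(n):
    """Return the driver names of all rules whose 'ports' field equals n."""
    rows = con.execute(
        "SELECT json_extract(body, '$.driver') FROM rule "
        "WHERE json_extract(body, '$.ports') = ? ORDER BY id",
        (n,),
    )
    return [driver for (driver,) in rows]
```

This keeps the nested shape of the data while still letting SQL filter on individual fields.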
2025-11-13 11:55:29 +0100 <kuribas`> and strings for addresses, driver names, device models, etc..
2025-11-13 11:55:12 +0100 poscat0x04 (~poscat@user/poscat) (Ping timeout: 256 seconds)
2025-11-13 11:54:59 +0100 <kuribas`> with entries like ("foo", *, 2)
2025-11-13 11:54:46 +0100 <kuribas`> That doesn't go into SQL easily.
2025-11-13 11:54:42 +0100 <kuribas`> I have a nested dictionary with wildcards.
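[editor's note] One possible reading of "nested dictionary with wildcards", sketched in Python. The log does not say how the wildcard is meant to behave, so this assumes a stored `"*"` key matches any value at that level and that a lookup returns every matching entry. `lookup`, `WILDCARD` and the sample `rules` are all made up.

```python
WILDCARD = "*"


def lookup(tree, path):
    """Return every stored value whose key pattern matches path.

    At each level both the exact key and the wildcard key are followed.
    """
    if not path:
        return [tree]
    if not isinstance(tree, dict):
        return []  # path is longer than the stored entry
    head, rest = path[0], path[1:]
    results = []
    if head in tree:
        results += lookup(tree[head], rest)
    if WILDCARD in tree and head != WILDCARD:
        results += lookup(tree[WILDCARD], rest)
    return results


# Hypothetical data: the first entry corresponds to ("foo", *, 2).
rules = {
    "foo": {
        "*": {2: "rule-a"},
        "x1": {2: "rule-b", 3: "rule-c"},
    },
}
```

Exact matches are listed before wildcard matches, so a caller that wants "most specific wins" can take the first element.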
2025-11-13 11:54:12 +0100 <merijn> At that point, just dump everything into SQLite
2025-11-13 11:54:07 +0100 <[exa]> y a p
2025-11-13 11:53:42 +0100 <merijn> [exa]: Sure, but if you've got enough string keys for that to matter you should probably rethink your approach anyway :p
2025-11-13 11:53:08 +0100 poscat (~poscat@user/poscat) poscat
2025-11-13 11:52:49 +0100 <[exa]> no, trie
2025-11-13 11:52:38 +0100 <kuribas`> [exa]: You build a sorted vector first?
2025-11-13 11:52:36 +0100 <[exa]> merijn: in the "naive" tree case the strings are super annoying (generic tree algorithms do repeated compares on shared key prefixes etc)
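[editor's note] A bare-bones trie, to make [exa]'s point concrete: keys that share a prefix share a path, so a lookup reads each character of the query once instead of re-comparing the common prefix at every tree node. This is a sketch, not the structure from the benchmark under discussion. The per-node dict is used for brevity; a serious implementation would likely use something more compact.

```python
class Trie:
    """Character trie mapping string keys to values."""

    def __init__(self):
        self.children = {}      # one child node per next character
        self.value = None
        self.has_value = False  # distinguishes "no key here" from value None

    def insert(self, key, value):
        node = self
        for ch in key:
            nxt = node.children.get(ch)
            if nxt is None:
                nxt = node.children[ch] = Trie()
            node = nxt
        node.value = value
        node.has_value = True

    def get(self, key, default=None):
        node = self
        for ch in key:          # exactly one step per character of the key
            node = node.children.get(ch)
            if node is None:
                return default
        return node.value if node.has_value else default


t = Trie()
t.insert("device", 1)
t.insert("devil", 2)
```

Here "device" and "devil" share the path d-e-v-i and only branch at the fifth character.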
2025-11-13 11:51:25 +0100 <[exa]> s/strings/variable-width keys/
2025-11-13 11:51:12 +0100 <[exa]> kuribas`: the main point is that if you really want an index over strings with only Eq possible, you've likely pushed yourself into over-generalizing the situation, and you should use integers instead of the strings
2025-11-13 11:49:18 +0100 <merijn> Better worst case complexity, better space complexity, more flexible queries, negligible performance difference in 95% of scenarios (that might be a conservative percentage)
2025-11-13 11:48:18 +0100 merijn will take a tree based Map over a hashmap any day
2025-11-13 11:48:12 +0100 deptype_ (~deptype@2406:b400:3a:73c2:d739:473:9e2d:bf26)
2025-11-13 11:47:58 +0100 deptype_ (~deptype@2406:b400:3a:73c2:ebd8:a6e4:ac56:ebb9) (Remote host closed the connection)
2025-11-13 11:47:13 +0100 <merijn> kuribas`: I mean, why would a hashmap be better at strings than a tree based map?
2025-11-13 11:46:45 +0100 <merijn> kuribas`: Even then, meh
2025-11-13 11:40:50 +0100 <kuribas`> [exa]: The trie is slower in that benchmark
2025-11-13 11:39:23 +0100 <[exa]> kuribas`: tries? inverted indices?
2025-11-13 11:39:13 +0100 <[exa]> merijn: like, the immutability shouldn't be an issue at all in that precise benchmark, it's select-only
2025-11-13 11:38:38 +0100 <kuribas`> [exa]: how else, if you have strings?
2025-11-13 11:38:18 +0100 <[exa]> people who index stuff with unstructured strings truly deserve hashmaps
2025-11-13 11:33:37 +0100 <kuribas`> merijn: good for strings?
2025-11-13 11:31:58 +0100 <merijn> In general I'm a well-known hash map hater, though. Waaaay overrated data structure
2025-11-13 11:31:36 +0100 <merijn> kuribas: I mean, immutable hashmaps seem like the worst of all worlds
2025-11-13 11:31:33 +0100 merijn (~merijn@77.242.116.146) merijn
2025-11-13 11:28:14 +0100 deptype_ (~deptype@2406:b400:3a:73c2:ebd8:a6e4:ac56:ebb9)
2025-11-13 11:27:56 +0100 deptype_ (~deptype@2406:b400:3a:73c2:796f:1d1b:ab7f:a73f) (Remote host closed the connection)
2025-11-13 11:24:04 +0100 kuribas (~user@2a02:1808:67:a09:b55b:215:13f6:6a3b) (Ping timeout: 255 seconds)
2025-11-13 11:22:40 +0100 <[exa]> anyway I assume the table building has leaked into the benchmark for the non-IO variant, half a second for selection in a whimsy 1M table is... too much.
2025-11-13 11:22:16 +0100 kuribas` (~user@ip-188-118-57-242.reverse.destiny.be) kuribas
2025-11-13 11:20:25 +0100 merijn (~merijn@77.242.116.146) (Ping timeout: 240 seconds)
2025-11-13 11:20:14 +0100 trickard_ (~trickard@cpe-62-98-47-163.wireline.com.au)
2025-11-13 11:20:01 +0100 trickard__ (~trickard@cpe-62-98-47-163.wireline.com.au) (Read error: Connection reset by peer)
2025-11-13 11:17:57 +0100 <kuribas> right