2025/01/22

Newest at the top

2025-01-22 21:09:21 +0100 ColinRobinson JuanDaugherty
2025-01-22 21:08:18 +0100 simplystuart (~simplystu@c-75-75-152-164.hsd1.pa.comcast.net) (Ping timeout: 276 seconds)
2025-01-22 21:06:07 +0100 alfiee (~alfiee@user/alfiee) (Ping timeout: 244 seconds)
2025-01-22 21:04:33 +0100 nhar (~noah@host-68-169-128-200.BROOLT1.epbfi.com) (Ping timeout: 248 seconds)
2025-01-22 21:02:57 +0100 visilii (~visilii@188.254.110.9) (Ping timeout: 248 seconds)
2025-01-22 21:01:48 +0100 alfiee (~alfiee@user/alfiee) alfiee
2025-01-22 21:01:48 +0100 lxsameer (lxsameer@Serene/lxsameer) (Ping timeout: 276 seconds)
2025-01-22 21:00:41 +0100 caconym (~caconym@user/caconym) caconym
2025-01-22 21:00:04 +0100 caconym (~caconym@user/caconym) (Quit: bye)
2025-01-22 20:59:46 +0100 <dminuoso> RDBMSs tend to get this more right than most programming languages.
2025-01-22 20:59:36 +0100 <tomsmeding> I've had the luxury so far of not working on applications where that's actually important
2025-01-22 20:59:33 +0100 visilii_ (~visilii@213.24.126.57)
2025-01-22 20:59:16 +0100 <tomsmeding> right
2025-01-22 20:59:10 +0100 <dminuoso> There can be so many notions of equality with string-like data.
2025-01-22 20:59:00 +0100 <dminuoso> tomsmeding: Honestly the topic of strings and equality is an annoying business, because "collation" does not refer to a particular authoritative strategy either.
2025-01-22 20:58:28 +0100 <tomsmeding> took some time checking that I was doing the right thing because my terminal isn't rendering the uncombined form -.-
2025-01-22 20:57:42 +0100 <tomsmeding> that's a + combining diaeresis, versus the pre-combined form
2025-01-22 20:57:29 +0100 <dminuoso> And even if not, strided prefetchers will trigger (I don't know the assembly generated by compareByteArrays# however)
2025-01-22 20:57:28 +0100 <lambdabot> False
2025-01-22 20:57:26 +0100 <tomsmeding> > "\x61\x308" == "\xe4"
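The lambdabot result above can be reproduced with nothing but base. The following is a minimal sketch (the names `decomposed`, `precomposed` and `codePoints` are made up for illustration) showing that the two literals render as the same character but consist of different code points, which is why `==` on String reports False. It does not perform any Unicode normalization.

```haskell
import Data.Char (ord)

-- 'a' followed by U+0308 COMBINING DIAERESIS (decomposed form).
decomposed :: String
decomposed = "\x61\x308"

-- U+00E4 LATIN SMALL LETTER A WITH DIAERESIS (pre-combined form).
precomposed :: String
precomposed = "\xe4"

-- Show the raw code points that (==) actually compares.
codePoints :: String -> [Int]
codePoints = map ord
```

Comparing `codePoints decomposed` with `codePoints precomposed` makes the difference visible even when a terminal renders both forms identically.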
2025-01-22 20:57:04 +0100 <dminuoso> Then memcmp is *definitely* faster. Depending on alignment it could be a tight loop on a single cache line.
2025-01-22 20:55:59 +0100 <dminuoso> https://hackage.haskell.org/package/text-2.1.2/docs/src/Data.Text.Array.html#compareInternal - so this ends up using memcmp or compareByteArrays# after all
2025-01-22 20:55:28 +0100 <dminuoso> Guess: import qualified Data.Text.Array as A
2025-01-22 20:55:20 +0100 <dminuoso> | lenA == lenB = A.equal arrA offA arrB offB lenA
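The guard quoted above suggests a two-step comparison: check the lengths first, then compare the underlying bytes. Below is a toy model of that shape using only base. `Slice`, `sliceBytes` and `sliceEq` are hypothetical names, the list of `Word8` merely stands in for a real byte array, and the `otherwise = False` branch is an assumption; this is not the text library's implementation.

```haskell
import Data.Word (Word8)

-- Hypothetical stand-in for a Text value: a buffer plus offset and length.
data Slice = Slice
  { sliceBuf :: [Word8]
  , sliceOff :: Int
  , sliceLen :: Int
  }

-- The bytes a slice actually covers.
sliceBytes :: Slice -> [Word8]
sliceBytes (Slice buf off len) = take len (drop off buf)

-- Length check first, then a byte-for-byte comparison.
sliceEq :: Slice -> Slice -> Bool
sliceEq a b
  | sliceLen a == sliceLen b = sliceBytes a == sliceBytes b
  | otherwise = False
```

Under this model, equality is purely byte-wise: two slices that differ in length are rejected without looking at their contents, and no collation or normalization is involved.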
2025-01-22 20:54:56 +0100 ash3en (~Thunderbi@ip1f10cbd6.dynamic.kabel-deutschland.de) (Client Quit)
2025-01-22 20:54:51 +0100 <dminuoso> I don't... know?
2025-01-22 20:54:41 +0100 <dminuoso> But the question is very good actually
2025-01-22 20:54:30 +0100 <dminuoso> Text contains Unicode
2025-01-22 20:54:27 +0100 <tomsmeding> but this is tangential to the point; is your original question answered by "lookup and ((lookup .) . map swap)"?
2025-01-22 20:54:25 +0100 <dminuoso> tomsmeding: Uh, what about collation?
2025-01-22 20:54:02 +0100 <tomsmeding> isn't equality on Text byte-equality, hence memcmp(), hence quite fast?
2025-01-22 20:54:00 +0100 <dminuoso> The strings are all short (10-15ish) however
2025-01-22 20:53:34 +0100 <dminuoso> The biggest cost will be equality on the Text probably. ;)
2025-01-22 20:53:28 +0100 ash3en (~Thunderbi@ip1f10cbd6.dynamic.kabel-deutschland.de) ash3en
2025-01-22 20:52:45 +0100 <Rembane> Or maybe a vector?
2025-01-22 20:52:40 +0100 <tomsmeding> but linear search is definitely quite fine, yes
2025-01-22 20:52:30 +0100 <tomsmeding> a map might already be faster at that point, because it's not linear search in an array but in a linked list
2025-01-22 20:52:21 +0100 <dminuoso> Linear search is probably faster in all cases.
2025-01-22 20:52:04 +0100 <dminuoso> No bigger than 50
2025-01-22 20:52:03 +0100 <tomsmeding> complexity analysis is about the worst case. :)
2025-01-22 20:51:54 +0100 weary-traveler (~user@user/user363627) (Remote host closed the connection)
2025-01-22 20:51:43 +0100 <tomsmeding> what about the remaining 5%?
2025-01-22 20:51:30 +0100 <dminuoso> I'd have to do some statistical analysis, but I would say 95% of them have fewer than 15 elements.
2025-01-22 20:51:25 +0100 <tomsmeding> ((lookup .) . map swap), rather
2025-01-22 20:51:07 +0100 <tomsmeding> in which case, lookup and (lookup . map swap)?
2025-01-22 20:50:55 +0100 <tomsmeding> unless the lists are small enough that you want to do linear search
2025-01-22 20:50:44 +0100 <tomsmeding> dminuoso: I agree. But this does feel like it would benefit from a data structure that maintains the invariant
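For small lists, the linear-search idea mentioned above (plain `lookup` one way, `lookup` over swapped pairs the other way) might look roughly like this. It is written pointfully to keep the argument order explicit; `lookupFwd`, `lookupRev` and `sample` are hypothetical names and data, not something from the discussion.

```haskell
import Data.Tuple (swap)

-- Forward direction: find the value paired with a key.
lookupFwd :: Eq a => a -> [(a, b)] -> Maybe b
lookupFwd = lookup

-- Reverse direction: swap every pair, then reuse lookup.
lookupRev :: Eq b => b -> [(a, b)] -> Maybe a
lookupRev v pairs = lookup v (map swap pairs)

-- Hypothetical sample data for illustration only.
sample :: [(String, Int)]
sample = [("one", 1), ("two", 2), ("three", 3)]
```

Both directions are linear scans and nothing enforces that keys or values are unique, so this only behaves like a bijection if the list already satisfies that invariant.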
2025-01-22 20:50:06 +0100 <dminuoso> tomsmeding: Not a big fan of depending on packages for a small isolated problem.
2025-01-22 20:48:04 +0100 srazkvt (~sarah@user/srazkvt) (Quit: Konversation terminated!)
2025-01-22 20:47:50 +0100 alecs (~alecs@61.pool85-58-154.dynamic.orange.es) alecs