2025/03/21

Newest at the top

2025-03-21 11:20:41 +0100 <Athas> tomsmeding: I gave it a shot yesterday, but received some type errors I couldn't figure out. Maybe I will try again.
2025-03-21 11:18:11 +0100 <tomsmeding> Athas: if you haven't yet, I recommend implementing 'ff' by Forward over ForwardDouble, I suspect it'll help quite a bit
2025-03-21 11:17:16 +0100 <tomsmeding> perhaps that just introduces busywork here? I don't know
2025-03-21 11:16:53 +0100 <tomsmeding> oh, the zero is relevant if you're doing nested AD I guess
2025-03-21 11:16:33 +0100 <tomsmeding> the Forward in 'ad' is a sum type with special cases for zero (not sure why?) and constants
2025-03-21 11:15:13 +0100 <Athas> Yes, forward mode is better here, but it is still slow.
2025-03-21 11:15:04 +0100 <tomsmeding> I have no clue just from looking at the code; I would perhaps profile to see if there's anything surprising, but it's bound to produce noise here
2025-03-21 11:14:12 +0100 <tomsmeding> the input to f is also only length 2, so doing forward mode twice has a chance of being competitive with reverse AD
2025-03-21 11:14:08 +0100 <Athas> Yes.
2025-03-21 11:13:08 +0100 <tomsmeding> the closest match is 'ff'?
2025-03-21 11:13:01 +0100 <tomsmeding> oh, right
2025-03-21 11:12:58 +0100 <Athas> The hand-written code only has forward-over-forward.
2025-03-21 11:12:46 +0100 <Athas> I have done all variants in my 'ad' code.
2025-03-21 11:12:24 +0100 <tomsmeding> I was looking at the stalingrad example, but it seems you've implemented that with a proper reverse-mode gradient
2025-03-21 11:11:41 +0100 <Athas> tomsmeding: in which program?
2025-03-21 11:10:51 +0100 <tomsmeding> is that intentional?
2025-03-21 11:10:48 +0100 <tomsmeding> Athas: I see a 'gradient' function that uses forward AD
2025-03-21 11:07:05 +0100 <tomsmeding> I like this blast-to-the-past Haskell style
2025-03-21 11:06:15 +0100 <Athas> Forward over Forward.
2025-03-21 11:06:10 +0100 <tomsmeding> 'ad' with Forward over Forward, or Forward over ForwardDouble?
2025-03-21 11:05:45 +0100 <tomsmeding> lol
2025-03-21 11:05:42 +0100 <Athas> Sure, but it is already faster than 'ad'.
2025-03-21 11:05:41 +0100 <tomsmeding> but let me read
2025-03-21 11:05:31 +0100 <tomsmeding> that will likely be faster if you `data Bundle = Bundle {-# UNPACK #-} !Double {-# UNPACK #-} !Double`
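
[Editor's note] A minimal sketch of the strict, unpacked dual-number type suggested in the message above. Only the `Bundle` declaration comes from the log; the `Num` and `Fractional` instances are illustrative assumptions and are not taken from the linked common-ghc code. Whether this is actually faster would have to be measured.

```haskell
-- Strict fields with UNPACK ask GHC to store both Doubles directly in the
-- constructor rather than as pointers to (possibly unevaluated) heap objects.
data Bundle = Bundle {-# UNPACK #-} !Double {-# UNPACK #-} !Double
  deriving (Show, Eq)

-- Illustrative instances (assumed, not from the log): primal value first,
-- tangent second, with the usual sum and product rules.
instance Num Bundle where
  Bundle x dx + Bundle y dy = Bundle (x + y) (dx + dy)
  Bundle x dx - Bundle y dy = Bundle (x - y) (dx - dy)
  Bundle x dx * Bundle y dy = Bundle (x * y) (x * dy + dx * y)
  negate (Bundle x dx)      = Bundle (negate x) (negate dx)
  abs (Bundle x dx)         = Bundle (abs x) (dx * signum x)
  signum (Bundle x _)       = Bundle (signum x) 0
  fromInteger n             = Bundle (fromInteger n) 0

instance Fractional Bundle where
  -- Quotient rule for the tangent component.
  Bundle x dx / Bundle y dy = Bundle (x / y) ((dx * y - x * dy) / (y * y))
  fromRational r            = Bundle (fromRational r) 0
```

For example, `(\t -> t * t + 1) (Bundle 3 1)` carries the value and the derivative at 3 together in one `Bundle`.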
2025-03-21 11:05:24 +0100 <Athas> But it is easy to fix.
2025-03-21 11:05:18 +0100 <Athas> There are also n+k patterns.
2025-03-21 11:05:13 +0100 <Athas> Yes, it is aaaalmost working Haskell.
2025-03-21 11:04:42 +0100 <tomsmeding> ooh, DatatypeContexts
2025-03-21 11:04:26 +0100 <Athas> And the dual numbers: https://engineering.purdue.edu/~qobi/stalingrad-examples2009/common-ghc.html
2025-03-21 11:04:14 +0100 <Athas> This is the ad hoc version: https://engineering.purdue.edu/~qobi/stalingrad-examples2009/particle-FF-ghc.html
2025-03-21 11:04:14 +0100 <Athas> This is my code: https://github.com/gradbench/gradbench/blob/4fdb8cc00daaae42b99431fde3da7be1b1bbbc13/tools/haskell…
2025-03-21 11:02:35 +0100 <tomsmeding> does that help?
2025-03-21 11:02:28 +0100 <tomsmeding> the former you get with reverse AD, which is more complicated
2025-03-21 11:02:09 +0100 <tomsmeding> the typical dual-numbers formulation gives you the _latter_, whereas you usually (but not always) want the former
2025-03-21 11:01:49 +0100 <tomsmeding> the problem is that you can do so efficiently for a function of type R^n -> R, or for a function of type R -> R^n
2025-03-21 11:01:31 +0100 <tomsmeding> __monty__: it is
2025-03-21 11:01:21 +0100 <__monty__> (I'd appreciate another recap of what AD is. It's not just a way to numerically compute derivatives of numerical functions, is it? Feel free to leave the recap for when the discussion is more or less over.)
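
[Editor's note] A minimal sketch of the dual-number formulation described in the exchange above. All names here (`Dual`, `diffCurve`, `gradForward`) are hypothetical and not taken from `ad` or from the linked code. It illustrates the asymmetry under discussion: for a function R -> R^n, one forward pass yields every component of the derivative, whereas for R^n -> R forward mode needs one pass per input, which is the case reverse AD handles more efficiently.

```haskell
-- A dual number: a value paired with its tangent (directional derivative).
data Dual = Dual Double Double
  deriving (Show, Eq)

instance Num Dual where
  Dual x dx + Dual y dy = Dual (x + y) (dx + dy)
  Dual x dx - Dual y dy = Dual (x - y) (dx - dy)
  Dual x dx * Dual y dy = Dual (x * y) (x * dy + dx * y)
  negate (Dual x dx)    = Dual (negate x) (negate dx)
  abs (Dual x dx)       = Dual (abs x) (dx * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  fromInteger n         = Dual (fromInteger n) 0

-- R -> R^n: seed the single input with tangent 1; one pass gives all outputs.
diffCurve :: (Dual -> [Dual]) -> Double -> [Double]
diffCurve f x = [dy | Dual _ dy <- f (Dual x 1)]

-- R^n -> R: forward mode needs one pass per input, seeding input i with
-- tangent 1 and every other input with tangent 0.
gradForward :: ([Dual] -> Dual) -> [Double] -> [Double]
gradForward f xs =
  [ tangent (f [Dual x (if i == j then 1 else 0) | (j, x) <- zip [0 ..] xs])
  | i <- [0 .. length xs - 1]
  ]
  where
    tangent (Dual _ d) = d
```

With only two inputs, as in the benchmark discussed here, `gradForward` runs the function just twice, which is why forward mode can stay competitive with reverse mode.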
2025-03-21 11:00:34 +0100 <tomsmeding> can you share your code with the manual dual numbers? I'm curious to see what beats `ad`
2025-03-21 11:00:14 +0100 <tomsmeding> few people are
2025-03-21 11:00:04 +0100 <Athas> I've realised I'm not good at fast Haskell.
2025-03-21 10:59:46 +0100 <Athas> Well, it's not so easy - I need actual nested AD.
2025-03-21 10:59:28 +0100 <tomsmeding> did you try Numeric.AD.Mode.Tower(.Double)? It purports to be higher-order forward derivatives
2025-03-21 10:59:03 +0100 <tomsmeding> forward in `ad` is just a dual number, so that's rather surprising
2025-03-21 10:58:43 +0100 <Athas> Forward-over-forward. And it's slower than just hacking up your own dual numbers.
2025-03-21 10:58:29 +0100 <tomsmeding> Athas: which mode did you use?
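
[Editor's note] A minimal sketch of what "forward-over-forward" means with hand-rolled dual numbers: the dual-number type is made polymorphic and nested inside itself, so the inner tangent of the outer tangent holds the second derivative. The names (`Dual`, `tangent`, `diff2`) are hypothetical and not taken from `ad` or from the linked code. Unlike `ad`, this sketch has no type-level protection against perturbation confusion, so it is only safe for simple uses such as the one shown.

```haskell
-- Polymorphic dual number, so that it can be nested: Dual (Dual Double).
data Dual a = Dual a a
  deriving (Show, Eq)

instance Num a => Num (Dual a) where
  Dual x dx + Dual y dy = Dual (x + y) (dx + dy)
  Dual x dx - Dual y dy = Dual (x - y) (dx - dy)
  Dual x dx * Dual y dy = Dual (x * y) (x * dy + dx * y)
  negate (Dual x dx)    = Dual (negate x) (negate dx)
  abs (Dual x dx)       = Dual (abs x) (dx * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  fromInteger n         = Dual (fromInteger n) 0

-- Project out the tangent component.
tangent :: Dual a -> a
tangent (Dual _ d) = d

-- Second derivative via forward-over-forward: seed both the inner and the
-- outer perturbation with 1, then read off the tangent of the tangent.
diff2 :: (Dual (Dual Double) -> Dual (Dual Double)) -> Double -> Double
diff2 f x = tangent (tangent (f (Dual (Dual x 1) (Dual 1 0))))
```

For instance, `diff2 (\t -> t * t * t) 2` computes the second derivative of the cube function at 2.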