Newest at the top
2025-03-21 11:20:41 +0100 | <Athas> | tomsmeding: I gave it a shot yesterday, but received some type errors I couldn't figure out. Maybe I will try again. |
2025-03-21 11:18:11 +0100 | <tomsmeding> | Athas: if you haven't yet, I recommend implementing 'ff' by Forward over ForwardDouble, I suspect it'll help quite a bit |
2025-03-21 11:17:50 +0100 | mniip | (mniip@libera/staff/mniip) (Ping timeout: 604 seconds) |
2025-03-21 11:17:16 +0100 | <tomsmeding> | perhaps that just introduces busywork here? I don't know |
2025-03-21 11:16:53 +0100 | <tomsmeding> | oh, the zero is relevant if you're doing nested AD I guess |
2025-03-21 11:16:33 +0100 | <tomsmeding> | the Forward in 'ad' is a sum type with special cases for zero (not sure why?) and constants |
2025-03-21 11:15:13 +0100 | <Athas> | Yes, forward mode is better here, but it is still slow. |
2025-03-21 11:15:04 +0100 | <tomsmeding> | I have no clue just from looking at the code; I would perhaps profile to see if there's anything surprising, but it's bound to produce noise here |
2025-03-21 11:14:12 +0100 | <tomsmeding> | the input to f is also only length 2, so doing forward mode twice has a chance of being competitive with reverse AD |
2025-03-21 11:14:08 +0100 | <Athas> | Yes. |
2025-03-21 11:14:02 +0100 | xff0x | (~xff0x@fsb6a9491c.tkyc517.ap.nuro.jp) (Ping timeout: 272 seconds) |
2025-03-21 11:13:08 +0100 | <tomsmeding> | the closest match is 'ff'? |
2025-03-21 11:13:01 +0100 | <tomsmeding> | oh, right |
2025-03-21 11:12:58 +0100 | <Athas> | The hand-written code only has forward-over-forward. |
2025-03-21 11:12:46 +0100 | <Athas> | I have done all variants in my 'ad' code. |
2025-03-21 11:12:24 +0100 | <tomsmeding> | I was looking at the stalingrad example, but it seems you've implemented that with a proper reverse-mode gradient |
2025-03-21 11:11:41 +0100 | <Athas> | tomsmeding: in which program? |
2025-03-21 11:10:51 +0100 | <tomsmeding> | is that intentional? |
2025-03-21 11:10:48 +0100 | <tomsmeding> | Athas: I see a 'gradient' function that uses forward AD |
2025-03-21 11:08:36 +0100 | gmg | (~user@user/gehmehgeh) (Ping timeout: 264 seconds) |
2025-03-21 11:07:05 +0100 | <tomsmeding> | I like this blast-to-the-past Haskell style |
2025-03-21 11:06:38 +0100 | alfiee | (~alfiee@user/alfiee) (Ping timeout: 245 seconds) |
2025-03-21 11:06:15 +0100 | <Athas> | Forward over Forward. |
2025-03-21 11:06:10 +0100 | <tomsmeding> | 'ad' with Forward over Forward, or Forward over ForwardDouble? |
2025-03-21 11:05:45 +0100 | <tomsmeding> | lol |
2025-03-21 11:05:42 +0100 | <Athas> | Sure, but it is already faster than 'ad'. |
2025-03-21 11:05:41 +0100 | <tomsmeding> | but let me read |
2025-03-21 11:05:31 +0100 | <tomsmeding> | that will likely be faster if you `data Bundle = Bundle {-# UNPACK #-} !Double {-# UNPACK #-} !Double` |
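A minimal sketch of the strict, unpacked dual number suggested in the message above, assuming a monomorphic `Double` payload. Only the `data` declaration comes from the chat; the `Num` instance and the helper `diffBundle` are hypothetical illustrations, not the code from the linked pages.

```haskell
-- Dual number: primal value and tangent, both strict and unpacked so GHC
-- can store the two Doubles directly in the constructor without boxing.
data Bundle = Bundle {-# UNPACK #-} !Double {-# UNPACK #-} !Double
  deriving (Show, Eq)

-- Hypothetical instance: ordinary arithmetic on the primal part,
-- sum and product rules on the tangent part.
instance Num Bundle where
  Bundle x x' + Bundle y y' = Bundle (x + y) (x' + y')
  Bundle x x' - Bundle y y' = Bundle (x - y) (x' - y')
  Bundle x x' * Bundle y y' = Bundle (x * y) (x * y' + x' * y)
  negate (Bundle x x')      = Bundle (negate x) (negate x')
  abs (Bundle x x')         = Bundle (abs x) (x' * signum x)
  signum (Bundle x _)       = Bundle (signum x) 0
  fromInteger n             = Bundle (fromInteger n) 0

-- Hypothetical helper: derivative of f at x, obtained by seeding the
-- tangent with 1 and reading the tangent of the result.
diffBundle :: (Bundle -> Bundle) -> Double -> Double
diffBundle f x = case f (Bundle x 1) of
  Bundle _ x' -> x'
```

Because this `Bundle` is fixed to `Double`, it cannot be nested inside itself; a forward-over-forward setup would still need a type that is polymorphic in its payload.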
2025-03-21 11:05:24 +0100 | <Athas> | But it is easy to fix. |
2025-03-21 11:05:18 +0100 | <Athas> | There are also n+k patterns. |
2025-03-21 11:05:13 +0100 | <Athas> | Yes, it is aaaalmost working Haskell. |
2025-03-21 11:04:42 +0100 | <tomsmeding> | ooh, DatatypeContexts |
2025-03-21 11:04:26 +0100 | <Athas> | And the dual numbers: https://engineering.purdue.edu/~qobi/stalingrad-examples2009/common-ghc.html |
2025-03-21 11:04:14 +0100 | <Athas> | This is the ad hoc version: https://engineering.purdue.edu/~qobi/stalingrad-examples2009/particle-FF-ghc.html |
2025-03-21 11:04:14 +0100 | <Athas> | This is my code: https://github.com/gradbench/gradbench/blob/4fdb8cc00daaae42b99431fde3da7be1b1bbbc13/tools/haskell… |
2025-03-21 11:02:35 +0100 | <tomsmeding> | does that help? |
2025-03-21 11:02:28 +0100 | <tomsmeding> | the former you get with reverse AD, which is more complicated |
2025-03-21 11:02:25 +0100 | alfiee | (~alfiee@user/alfiee) alfiee |
2025-03-21 11:02:09 +0100 | <tomsmeding> | the typical dual-numbers formulation gives you the _latter_, whereas you usually (but not always) want the former |
2025-03-21 11:01:49 +0100 | <tomsmeding> | the problem is that you can do so efficiently for a function of type R^n -> R, or for a function of type R -> R^n |
2025-03-21 11:01:31 +0100 | <tomsmeding> | __monty__: it is |
2025-03-21 11:01:21 +0100 | <__monty__> | (I'd appreciate another recap of what AD is. It's not just a way to numerically compute derivatives of numerical functions, is it? Feel free to leave the recap for when the discussion is more or less over.) |
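To make the R^n -> R versus R -> R^n point in this exchange concrete, here is a small sketch assuming hand-rolled dual numbers over `Double`. The names `Dual`, `directional` and `gradientForward` are hypothetical and do not come from `ad` or from the linked code. One forward pass yields a single directional derivative, so a full gradient of a function with n inputs takes n passes.

```haskell
-- Dual number: primal value and tangent.
data Dual = Dual Double Double

instance Num Dual where
  Dual x x' + Dual y y' = Dual (x + y) (x' + y')
  Dual x x' - Dual y y' = Dual (x - y) (x' - y')
  Dual x x' * Dual y y' = Dual (x * y) (x * y' + x' * y)
  negate (Dual x x')    = Dual (negate x) (negate x')
  abs (Dual x x')       = Dual (abs x) (x' * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  fromInteger n         = Dual (fromInteger n) 0

-- One forward pass: the derivative of f at xs along the direction vs.
directional :: ([Dual] -> Dual) -> [Double] -> [Double] -> Double
directional f xs vs = case f (zipWith Dual xs vs) of
  Dual _ d -> d

-- Gradient of f : R^n -> R built from n forward passes,
-- one per standard basis vector.
gradientForward :: ([Dual] -> Dual) -> [Double] -> [Double]
gradientForward f xs =
  [ directional f xs [ if i == j then 1 else 0 | j <- idxs ] | i <- idxs ]
  where
    idxs = [0 .. length xs - 1]
```

With only two inputs, `gradientForward` runs the function twice, which is the situation where repeated forward mode can stay competitive with reverse mode.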
2025-03-21 11:00:34 +0100 | <tomsmeding> | can you share your code with the manual dual numbers? I'm curious to see what beats `ad` |
2025-03-21 11:00:14 +0100 | <tomsmeding> | few people are |
2025-03-21 11:00:04 +0100 | <Athas> | I've realised I'm not good at fast Haskell. |
2025-03-21 10:59:46 +0100 | <Athas> | Well, it's not so easy - I need actual nested AD. |
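As an illustration of what forward-over-forward nesting looks like with hand-written dual numbers, here is a sketch assuming a dual number type that is polymorphic in its payload. `Dual`, `diff1` and `diff2` are hypothetical names, not the definitions from the linked code or from `ad`.

```haskell
-- Dual number over an arbitrary numeric payload, so it can be nested.
data Dual a = Dual a a

instance Num a => Num (Dual a) where
  Dual x x' + Dual y y' = Dual (x + y) (x' + y')
  Dual x x' - Dual y y' = Dual (x - y) (x' - y')
  Dual x x' * Dual y y' = Dual (x * y) (x * y' + x' * y)
  negate (Dual x x')    = Dual (negate x) (negate x')
  abs (Dual x x')       = Dual (abs x) (x' * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  fromInteger n         = Dual (fromInteger n) 0

-- First derivative: seed the tangent with 1 and read the result's tangent.
diff1 :: Num a => (Dual a -> Dual a) -> a -> a
diff1 f x = case f (Dual x 1) of
  Dual _ x' -> x'

-- Second derivative by forward-over-forward: differentiate the derivative.
-- The function itself runs at type Dual (Dual Double).
diff2 :: (Dual (Dual Double) -> Dual (Dual Double)) -> Double -> Double
diff2 f = diff1 (diff1 f)
```

For example, `diff2 (\x -> x * x * x) 2` differentiates x^3 twice at x = 2.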
2025-03-21 10:59:28 +0100 | <tomsmeding> | did you try Numeric.AD.Mode.Tower(.Double)? It purports to be higher-order forward derivatives |
2025-03-21 10:59:03 +0100 | <tomsmeding> | forward in `ad` is just a dual number, so that's rather surprising |
2025-03-21 10:58:43 +0100 | <Athas> | Forward-over-forward. And it's slower than just hacking up your own dual numbers. |
2025-03-21 10:58:29 +0100 | <tomsmeding> | Athas: which mode did you use? |