Help with autodiff and the ad package

emiruz · February 25, 2024, 7:37am

I’m trying to use the ad package to produce the gradient of an optimisation function, but the parameters of the function are dynamic (known at runtime) and featured throughout the function.

So far as I can tell, the function from Numeric.AD that I should use is grad. It allows me to do things like grad (\[x,y,z]-> x*y*z) or even grad (xs -> product xs).

However, the function I’m interested in looks like this:

(a_1*b_1 - v_1)^2 + (a2*b_1 - v2)^2 + (a_1*b2-v3)^2 + ...

Note that the a/b variables are re-used throughout, and how the a’s and b’s are re-used is decided at runtime.

If I have to pass in the a’s and b’s into the grad function as a list, it seems to me that I’d have to do a lot of O(N) indexing with !! to produce the calculation I need.

Is there a more efficient way to approach this?

For example, if there was a way to create symbolic variables, build a function which uses them, and then calculate the gradient vector of those variables, that would work. In Python, what I typically do is use a matrix with dummy coding to indicate which variables are used in any given row – that would also work.

rhendric · February 25, 2024, 7:44am

grad will work with any Traversable, so why not use a Vector or some other structure with O(1) indexing?

emiruz · February 25, 2024, 7:58am

I see, I wasn’t aware Vector was O(1) (I’m a noob), since things like IntMap and Map are not. I can now make some of my AoC code significantly faster too I think this will work, thank you.

tomjaguarpaw · February 25, 2024, 8:03am

You can also avoid indexing by defining your own specific record type

data Args a = Args { a_1 :: a, b_1 :: a, b2 :: a, ... }
   deriving (Foldable, Functor, Traversable)

emiruz · February 25, 2024, 8:08am

That’s useful to know, although it won’t work in my case because the variables are first known at runtime.

tomjaguarpaw · February 25, 2024, 8:08am

Ah, I see. Then a Vector sounds like a good option.

ocramz · February 25, 2024, 11:02am

Would it be correct to say that it’s the sparsity pattern of your cost function that is only known at runtime, whereas the total set of variables is fixed?

emiruz · February 25, 2024, 2:10pm

In my case, I’m making a little house pricing model – or more like translating it into Haskell as an exercise. There is a variable per house, and a variable for each month of the year. I can’t know the variables upfront because they are part of the data, so the variables themselves are not fixed.

Meanwhile the cost function (based on actual sale prices of houses at points in time compared to the predicted price) is just a sum of terms, wherein each term uses only 2 of the total set of variables (one house variable and one time variable).

Knowing that vector has O(1) lookup, I can pre-calculate all the indexes and then use ! to constantly look up the variables.

Topic		Replies	Views
Optimiser performance problems Learn	31	2334	April 30, 2024
Optimizations with enumerations Learn	14	787	October 9, 2023
Subquadratic behaviour of Data.List.nub Learn	9	621	April 10, 2023
Using the Accelerate library to implement Chebyshev Functions Links	0	287	July 30, 2019
How to cook with SIMD? Learn	8	954	August 26, 2024

Help with autodiff and the ad package

Related topics