Help with autodiff and the ad package

I’m trying to use the ad package to produce the gradient of an optimisation function, but the parameters of the function are dynamic (known at runtime) and featured throughout the function.

So far as I can tell, the function from Numeric.AD that I should use is grad. It allows me to do things like grad (\[x,y,z]-> x*y*z) or even grad (xs -> product xs).

However, the function I’m interested in looks like this:

(a_1*b_1 - v_1)^2 + (a2*b_1 - v2)^2 + (a_1*b2-v3)^2 + ...

Note that the a/b variables are re-used throughout, and how the a’s and b’s are re-used is decided at runtime.

If I have to pass in the a’s and b’s into the grad function as a list, it seems to me that I’d have to do a lot of O(N) indexing with !! to produce the calculation I need.

Is there a more efficient way to approach this?

For example, if there was a way to create symbolic variables, build a function which uses them, and then calculate the gradient vector of those variables, that would work. In Python, what I typically do is use a matrix with dummy coding to indicate which variables are used in any given row – that would also work.

grad will work with any Traversable, so why not use a Vector or some other structure with O(1) indexing?

2 Likes

I see, I wasn’t aware Vector was O(1) (I’m a noob), since things like IntMap and Map are not. I can now make some of my AoC code significantly faster too :slight_smile: I think this will work, thank you.

2 Likes

You can also avoid indexing by defining your own specific record type

data Args a = Args { a_1 :: a, b_1 :: a, b2 :: a, ... }
   deriving (Foldable, Functor, Traversable)
1 Like

That’s useful to know, although it won’t work in my case because the variables are first known at runtime.

1 Like

Ah, I see. Then a Vector sounds like a good option.

Would it be correct to say that it’s the sparsity pattern of your cost function that is only known at runtime, whereas the total set of variables is fixed?

In my case, I’m making a little house pricing model – or more like translating it into Haskell as an exercise. There is a variable per house, and a variable for each month of the year. I can’t know the variables upfront because they are part of the data, so the variables themselves are not fixed.

Meanwhile the cost function (based on actual sale prices of houses at points in time compared to the predicted price) is just a sum of terms, wherein each term uses only 2 of the total set of variables (one house variable and one time variable).

Knowing that vector has O(1) lookup, I can pre-calculate all the indexes and then use ! to constantly look up the variables.