Optimiser performance problems

There’s another weird thing: for my more complex cost with 50k+ parameters, the adaMax function looks as if it leaks memory. Even though it’s doing the same update on every iteration and not intentionally accumulating anything, it keeps eating memory until eventually it’s all gone.
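
One way to narrow this down might be to check whether a bare AdaMax update grows memory at all when nothing else is going on. The sketch below is only an illustration in plain Python, not the adaMax function from this post: `adamax_step`, `toy_grad`, and `memory_growth` are hypothetical names, the cost is a stand-in sum of squares, and the update rule is the standard AdaMax one (first-moment average, infinity-norm running maximum). It updates the parameter and state buffers in place and uses `tracemalloc` to compare traced memory after a short warm-up with traced memory after further iterations.

```python
import tracemalloc


def adamax_step(theta, grad, m, u, t,
                alpha=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One in-place AdaMax update; t is the 1-based iteration count."""
    step = alpha / (1.0 - beta1 ** t)  # bias-corrected step size
    for i, g in enumerate(grad):
        m[i] = beta1 * m[i] + (1.0 - beta1) * g   # first-moment estimate
        u[i] = max(beta2 * u[i], abs(g))          # infinity-norm running max
        theta[i] -= step * m[i] / (u[i] + eps)    # eps guards against u == 0


def toy_grad(theta):
    # Hypothetical stand-in cost: sum(x**2), so the gradient is 2*x.
    return [2.0 * x for x in theta]


def memory_growth(n_params=50_000, warmup=5, iters=20):
    """Return (bytes traced after warm-up, bytes traced after more steps, theta)."""
    theta = [1.0] * n_params
    m = [0.0] * n_params
    u = [0.0] * n_params

    tracemalloc.start()
    # Warm-up: lets the buffers fill with their steady-state float objects.
    for t in range(1, warmup + 1):
        adamax_step(theta, toy_grad(theta), m, u, t)
    before, _ = tracemalloc.get_traced_memory()

    # If the update itself held on to anything, this figure would keep rising.
    for t in range(warmup + 1, warmup + iters + 1):
        adamax_step(theta, toy_grad(theta), m, u, t)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return before, after, theta


if __name__ == "__main__":
    before, after, _ = memory_growth()
    print(f"traced after warm-up: {before} bytes, after more steps: {after} bytes")
```

If the two figures stay close for the bare update but memory still climbs in the real run, the growth is more likely coming from something around the update (for example per-iteration history, logging, or temporaries created while evaluating the cost and gradient) than from the AdaMax arithmetic itself.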