Description
I have the following objective function. Given an N-by-M matrix x, it computes the sum of squared norms of the differences between adjacent rows. (For instance, if x is a sequence of 2D locations, this function sums the squared lengths of the individual displacements.)
def min_displacement_obj(x):
    """
    Computes:
        T-1
        sum ||x_{t+1} - x_t||^2
        t=1
    """
    return np.sum(np.array([np.linalg.norm(x[i] - x[i-1])**2
                            for i in range(1, len(x))]))
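As a quick sanity check that the objective does what the docstring says, here is a minimal standalone run on a hypothetical 4-point 1-D path (plain NumPy, no autograd):

```python
import numpy as np

def min_displacement_obj(x):
    # sum over t of the squared step lengths ||x_{t+1} - x_t||^2
    return np.sum(np.array([np.linalg.norm(x[i] - x[i-1])**2
                            for i in range(1, len(x))]))

x = np.array([[1.0], [0.0], [0.0], [1.0]])  # steps: -1, 0, +1
print(min_displacement_obj(x))  # squared steps sum to 1 + 0 + 1 = 2.0
```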
I'd like to compute the gradient of this function with respect to x, so I used autograd.grad. But for the example below,
import autograd.numpy as np
from autograd import grad
f = min_displacement_obj
grad_f = grad(f)
x = np.array([[1],[0],[0],[1]])
grad_fval = grad_f(x)
print(grad_fval)
I get
[[ 2]
[-9223372036854775808]
[-9223372036854775808]
[ 2]]
This doesn't look right (unless I made a silly mistake): if I manually compute the gradient of f with respect to a component of x, say x[i], the formula I get is:
df(x)/dx[i] = d(||x[i] - x[i-1]||^2 + ||x[i+1] - x[i]||^2)/dx[i] = 4x[i] - 2x[i-1] - 2x[i+1]
Then, in the example above, for i=1, df(x)/dx[1] = 4*0 - 2*1 - 2*0 = -2. How does that become -9223372036854775808? I'm very confused.
However, if I just change x to
x = np.array([[1],[0],[0.1],[1]])
I get
[[ 2. ]
[-2.2]
[-1.6]
[ 1.8]]
This is correct, because again for i=1, df(x)/dx[1] = 4*0 - 2*1 - 2*0.1 = -2.2.
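Both hand computations can be double-checked against a direct NumPy implementation of the gradient formula above (a sketch, no autograd involved; boundary rows only have one neighbor, so the missing term simply drops out):

```python
import numpy as np

def min_displacement_grad(x):
    """Analytic gradient of sum_t ||x_{t+1} - x_t||^2 with respect to x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    d = x[1:] - x[:-1]       # row-wise steps x_{t+1} - x_t
    g[1:] += 2 * d           # each step contributes +2(x_{t+1} - x_t) to row t+1
    g[:-1] -= 2 * d          # and -2(x_{t+1} - x_t) to row t
    return g

print(min_displacement_grad([[1], [0], [0], [1]]).ravel())
# values 2, -2, -2, 2 -- so the expected gradient at x[1] is -2, not INT64_MIN
print(min_displacement_grad([[1], [0], [0.1], [1]]).ravel())
# values 2, -2.2, -1.6, 1.8 -- matching what autograd returns for the float case
```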
I found that autograd doesn't compute the correct result whenever adjacent components of x have the same value: it uniformly outputs -9223372036854775808 for the gradient w.r.t. all those components. Is this a bug?
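For what it's worth, -9223372036854775808 is INT64_MIN, which makes me suspect a NaN being cast into the integer dtype of x. Here is a sketch of where such a NaN could arise when two adjacent rows are equal (my guess at the mechanism, not verified against autograd's internals):

```python
import numpy as np

# The gradient of norm(v) is v / norm(v), which at v = 0 evaluates to 0/0.
# Squaring doesn't rescue it: the chain rule gives 2*norm(v) * (v/norm(v)),
# and 0 * (0/0) is still nan in floating point.
v = np.array([0.0])                          # difference between two equal rows
with np.errstate(invalid="ignore"):
    dnorm = v / np.linalg.norm(v)            # 0/0 -> nan
    dnormsq = 2 * np.linalg.norm(v) * dnorm  # 0 * nan -> nan
print(dnormsq)
# [nan]; if that nan is then cast into the int64 dtype of x, the result
# is typically INT64_MIN = -9223372036854775808 (platform-dependent)
```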