Keras BFGS training using Scipy minimize

Is it because I didn't input the gradient to minimize, and it cannot calculate the numerical approximation in this case?

It's because you don't supply the gradients, so SciPy approximates them by numerical differentiation. That is, it evaluates the function at X, then at X + epsilon, to approximate the local gradient.
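The forward-difference scheme described above can be sketched roughly as follows. This is a simplified illustration, not SciPy's actual implementation; `forward_diff` and the toy `quadratic` objective are hypothetical names introduced here, and everything is computed in 64-bit floats.

```python
import numpy as np

def forward_diff(f, x, eps):
    """Approximate the gradient of f at x with one-sided differences."""
    x = np.asarray(x, dtype=np.float64)
    f0 = f(x)                      # objective at the current point
    grad = np.zeros_like(x)
    for i in range(x.size):
        x_step = x.copy()
        x_step[i] += eps           # perturb one coordinate by eps
        grad[i] = (f(x_step) - f0) / eps
    return grad

def quadratic(v):
    # Toy objective (hypothetical); its exact gradient is 2 * (v - 0.5)
    return float(np.sum((v - 0.5) ** 2))

# Step size of the same order as SciPy's default for BFGS (about 1.49e-8)
eps = np.sqrt(np.finfo(np.float64).eps)
approx_grad = forward_diff(quadratic, [1.0, 2.0, 3.0], eps)
print(approx_grad)
```

In 64-bit arithmetic this tiny step works well, and the printed values should be close to the exact gradient `[1, 3, 5]`.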

But SciPy's default epsilon (about 1.49e-8) is so small that the perturbation is completely lost when the inputs are converted to 32-bit floats for Theano. The starting guess is not in fact a minimum; SciPy just thinks so because it sees no change in the value of the objective function. You simply need to increase the epsilon, like so:

from scipy.optimize import minimize  # loss() is the objective defined in the question

V = [1.0, 2.0, 3.0, 4.0, 1.0]
print('Starting loss = {}'.format(loss(V)))
# set the eps option to increase the step size used in numerical differentiation
res = minimize(loss, x0=V, method='BFGS', options={'eps': 1e-6, 'disp': True})
print('Ending loss = {}'.format(loss(res.x)))

Which gives:

Using Theano backend.
Starting loss = 2.49976992001
Optimization terminated successfully.
         Current function value: 1.002703
         Iterations: 19
         Function evaluations: 511
         Gradient evaluations: 73
Ending loss = 1.00270344184
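To see why the default step vanishes while 1e-6 survives, here is a small NumPy-only check. It is a sketch that assumes the backend casts the inputs to float32; it does not touch Keras or Theano, and the variable names are just for illustration.

```python
import numpy as np

x = np.float64(1.0)
default_eps = np.sqrt(np.finfo(np.float64).eps)  # about 1.49e-8
larger_eps = 1e-6                                 # the value passed via options above

# Spacing between adjacent float32 values near 1.0 (about 1.19e-7)
print(np.finfo(np.float32).eps)

# After casting to float32, the default perturbation rounds back to x ...
lost = np.float32(x + default_eps) == np.float32(x)
# ... while the larger perturbation still produces a different value.
kept = np.float32(x + larger_eps) != np.float32(x)
print(lost, kept)
```

The same reasoning applies to the other starting values (2.0, 3.0, 4.0), where float32 spacing is even coarser, so the finite-difference gradient comes out as exactly zero with the default step.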