Confusion about keras Model: __call__ vs. call vs. predict methods
Adding to @Dmitry Kabanov's answer: they are similar, yet they aren't exactly the same thing. If you care about performance, you need to look into the critical differences between them.
| predict() | __call__(), i.e. model(x) |
|---|---|
| Loops over the data in batches, which means that predict() calls can scale to very large arrays | Happens in-memory and doesn't scale |
| Use this if you just need the output value | Use this when you need to retrieve the gradients |
| Output is a NumPy value | Output is a Tensor |
| Use this if you have batches of data to be predicted | Use this for a small dataset |
| Relatively slower for small data | Relatively faster for small data |
For a more detailed explanation, please check the Keras FAQ.
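To make the output-type and gradient rows concrete, here is a minimal sketch. The toy model and the random input are assumptions made up for illustration, and it assumes TensorFlow's bundled Keras (`tf.keras`).

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model, used only for illustration.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

x = np.random.rand(16, 4).astype("float32")

# predict() iterates over x in batches and returns a NumPy array.
y_predict = model.predict(x, batch_size=8, verbose=0)

# Calling the model directly runs a single forward pass and returns a tensor.
y_call = model(x, training=False)

# Because the direct call stays in tensor form, it can be recorded on a
# GradientTape, e.g. to get the gradient of the output w.r.t. the input.
x_tensor = tf.convert_to_tensor(x)
with tf.GradientTape() as tape:
    tape.watch(x_tensor)
    out = model(x_tensor, training=False)
grads = tape.gradient(out, x_tensor)

print(type(y_predict), type(y_call), grads.shape)
```

Since predict() hands back a plain NumPy array, there is nothing for a tape to differentiate through, which is why the direct call is the one to use when gradients are needed.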
Just to complement the answer, as I was also searching for this: when you need to specify the training flag of the model for the inference phase, such as
model(X_new, training=False) when you have a batch normalization layer, for example, both
predict and predict_on_batch already do that when they are executed.
So, model(X_new, training=False) and
model.predict_on_batch(X_new) are equivalent.
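A small sketch of that equivalence, assuming `tf.keras` and a made-up model with a batch normalization layer (the layer sizes and `X_new` data are hypothetical):

```python
import numpy as np
import tensorflow as tf

# Hypothetical model containing a BatchNormalization layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1),
])

X_new = np.random.rand(32, 4).astype("float32")

# Inference mode, requested explicitly on the direct call.
y_call = model(X_new, training=False)

# predict_on_batch runs the whole array as one batch in inference mode.
y_batch = model.predict_on_batch(X_new)

print(np.allclose(y_call.numpy(), y_batch, atol=1e-5))

# With training=True the layer normalizes with the current batch statistics
# (and updates its moving averages), so the values generally differ.
y_train = model(X_new, training=True)
```

The values match, but the container does not: the direct call returns a tensor and predict_on_batch returns a NumPy array.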
The difference between
predict and predict_on_batch is that the latter runs over a single batch, while the former runs over a dataset that is split into batches, with the results merged to produce the final NumPy array of predictions.
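Roughly speaking, predict behaves like a loop of single-batch predictions whose results are concatenated. The sketch below imitates that by hand. It is an illustration under assumed names (toy model, batch size of 4), not how Keras implements predict internally.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model, used only for illustration.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

x = np.random.rand(10, 4).astype("float32")
batch_size = 4

# predict() splits x into batches of 4, 4 and 2 rows and merges the results.
full = model.predict(x, batch_size=batch_size, verbose=0)

# Manual equivalent: one predict_on_batch call per slice, then concatenate.
chunks = [
    model.predict_on_batch(x[i:i + batch_size])
    for i in range(0, len(x), batch_size)
]
manual = np.concatenate(chunks, axis=0)

print(full.shape, manual.shape, np.allclose(full, manual, atol=1e-5))
```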
Beyond the difference mentioned by @Dmitry Kabanov, the functions generate different types of output:
__call__ generates a Tensor, whereas
predict and predict_on_batch generate NumPy arrays. Also, according to the documentation,
__call__ is faster than the
predict function for small-scale inputs, i.e., inputs that fit in one batch.
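If you want to check the speed claim on your own machine, a rough timing sketch could look like the following. The model, input size, and repetition count are arbitrary assumptions, and the numbers will vary with hardware and library version, so no expected output is shown.

```python
import timeit

import numpy as np
import tensorflow as tf

# Hypothetical toy model and a small input that fits in one batch.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
x = np.random.rand(8, 4).astype("float32")

# Warm up both paths so one-time setup cost is not included in the timing.
model.predict(x, verbose=0)
model(x, training=False)

n = 20
t_predict = timeit.timeit(lambda: model.predict(x, verbose=0), number=n) / n
t_call = timeit.timeit(lambda: model(x, training=False), number=n) / n

print(f"predict:  {t_predict * 1e3:.2f} ms per call")
print(f"__call__: {t_call * 1e3:.2f} ms per call")
```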