Why CNN running in python is extremely slow in comparison to Matlab?

Maybe you could try to understand what part of the code takes a long time this way:

import onnx
from onnx_tf.backend import prepare 
import numpy as np
from PIL import Image 
import datetime

now = datetime.datetime.now()
onnx_model = onnx.load('trainednet.onnx')
tf_rep = prepare(onnx_model)
filepath = 'filepath.png' 
later = datetime.datetime.now()
difference = later - now
print("Loading time : %f ms" % (difference.microseconds / 1000))

img = Image.open(filepath).resize((224,224)).convert("RGB") 
img = array(img).transpose((2,0,1))
img = np.expand_dims(img, 0) 
img = img.astype(np.uint8) 

now = datetime.datetime.now()
probabilities = tf_rep.run(img) 
later = datetime.datetime.now()
difference = later - now
print("Prediction time : %f ms" % (difference.microseconds / 1000))
print(probabilities) 

Let me know what the output looks like :)


In this case, it appears that the Grapper optimization suite has encountered some kind of infinite loop or memory leak. I would recommend filing an issue against the Github repo.

It's challenging to debug why constant folding is taking so long, but you may have better performance using the ONNX TensorRT backend as compared to the TensorFlow backend. It achieves better performance as compared to the TensorFlow backend on Nvidia GPUs while compiling typical graphs more quickly. Constant folding usually doesn't provide large speedups for well optimized models.

import onnx
import onnx_tensorrt.backend as backend
import numpy as np

model = onnx.load("trainednet.onnx'")
engine = backend.prepare(model, device='CUDA:1')

filepath = 'filepath.png' 

img = Image.open(filepath).resize((224,224)).convert("RGB") 
img = array(img).transpose((2,0,1))
img = np.expand_dims(img, 0) 
img = img.astype(np.uint8) 
output_data = engine.run(img)[0]
print(output_data)