"RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead"?

As Usman Ali wrote in his comment, PyTorch (and most other DL toolboxes) expects a batch of images as input. Thus you need to call

output = model(data[None, ...])  

This inserts a singleton "batch" dimension into your input data.
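For example, a minimal sketch with a stand-in torchvision model and random data in place of your actual image:

import torch
from torchvision import models

model = models.resnet18()        # stand-in model; substitute your own
model.eval()

data = torch.rand(3, 224, 224)   # one CHW image without a batch dimension

with torch.no_grad():
    output = model(data[None, ...])   # same as model(data.unsqueeze(0))

print(output.shape)              # torch.Size([1, 1000])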

Please also note that the model you are using might expect a different input size (e.g., 3x299x299 for Inception-style models) rather than 3x224x224.
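If so, resizing during preprocessing fixes the spatial dimensions. A sketch assuming torchvision is available and the target size is 299x299 (as Inception-style models use); Image.new is just a stand-in for loading your own image:

from PIL import Image
from torchvision import transforms

pil_image = Image.new("RGB", (640, 480))   # placeholder for your real image

preprocess = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.ToTensor(),
])

data = preprocess(pil_image)
print(data.shape)                  # torch.Size([3, 299, 299])
# output = model(data[None, ...])  # then add the batch dimension as above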


According to the PyTorch documentation on convolutional layers, Conv2d layers expect input with the shape

(n_samples, channels, height, width) # e.g., (1000, 1, 224, 224)

Passing a batch of grayscale images in their usual (n_samples, height, width) format, e.g. (1000, 224, 224), won't work.

To get the right shape, you will need to add a channel dimension. You can do it as follows:

x = np.expand_dims(x, 1)      # if x is a numpy array of shape (1000, 224, 224)
tensor = tensor.unsqueeze(1)  # if tensor is a torch tensor of shape (1000, 224, 224)

The unsqueeze() method adds a dimension of size one at the specified index. The result would have shape:

(1000, 1, 224, 224)
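As a quick sanity check, here is a sketch with dummy data showing both routes producing a 4-D input that a Conv2d layer accepts (the layer and shapes are assumptions for the example):

import numpy as np
import torch
import torch.nn as nn

x = np.zeros((1000, 224, 224), dtype=np.float32)   # 1000 grayscale images, no channel dim
x = np.expand_dims(x, 1)                            # -> (1000, 1, 224, 224)
print(x.shape)

tensor = torch.zeros(1000, 224, 224).unsqueeze(1)   # -> torch.Size([1000, 1, 224, 224])
print(tensor.shape)

conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)
out = conv(tensor[:8])                              # run a small slice through the layer
print(out.shape)                                    # torch.Size([8, 32, 222, 222])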