Why is it that `input_shape` does not include the batch dimension when passed as an argument to the `Dense` layer?

You can specify the input shape of your model in several ways. For example, you can pass one of the following arguments to the first layer of your model:

  • `batch_input_shape`: A tuple where the first dimension is the batch size.
  • `input_shape`: A tuple that does not include the batch size, i.e., the batch size is assumed to be `None`, or `batch_size` if that argument is specified.
  • `input_dim`: A scalar indicating the dimension of the input.

In all these cases, Keras internally stores an attribute `_batch_input_shape`, which includes the batch dimension, and uses it to build the model.
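To make the relationship between the three arguments concrete, here is a minimal pure-Python sketch of how they can be folded into a single full shape. The function name `resolve_batch_input_shape` is hypothetical, and the precedence shown (`batch_input_shape` first, then `input_shape`, then `input_dim`) is my reading of the behaviour, not a copy of the Keras source.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[Optional[int], ...]


def resolve_batch_input_shape(
    batch_input_shape: Optional[Sequence[Optional[int]]] = None,
    input_shape: Optional[Sequence[Optional[int]]] = None,
    input_dim: Optional[int] = None,
    batch_size: Optional[int] = None,
) -> Optional[Shape]:
    """Hypothetical helper: fold the three arguments into one full shape."""
    # A full shape was given explicitly: use it as is.
    if batch_input_shape is not None:
        return tuple(batch_input_shape)
    # input_dim is shorthand for a one-dimensional input_shape.
    if input_shape is None and input_dim is not None:
        input_shape = (input_dim,)
    # Prepend the batch size, which stays None unless batch_size is given.
    if input_shape is not None:
        return (batch_size,) + tuple(input_shape)
    # Nothing was specified: the shape has to be inferred later.
    return None


print(resolve_batch_input_shape(batch_input_shape=(32, 10)))        # (32, 10)
print(resolve_batch_input_shape(input_shape=(10,)))                 # (None, 10)
print(resolve_batch_input_shape(input_shape=(10,), batch_size=32))  # (32, 10)
print(resolve_batch_input_shape(input_dim=10))                      # (None, 10)
```

Whichever argument you pass, the result always carries a batch dimension in front, which is why `input_shape` itself can leave it out.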

Regarding the `build` method, my guess is that this is indeed a conscious choice: information about the batch size might be useful for building the model in some (perhaps unthought-of) situations. A framework that passes the batch dimension to `build` is therefore more general than one that doesn't. Nonetheless, I agree with you that naming the argument `batch_input_shape` instead of `input_shape` would make everything more consistent.


It is also worth mentioning that users rarely need to call the `build` method themselves. Keras calls it internally when it is needed. Nowadays, it is even possible to omit the `input_shape` argument when creating the model (although methods like `summary` will then not work until the model is built). In this case, Keras is able to infer the input shape from the argument `x` of `fit`.
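The deferred-build idea can be illustrated with a toy layer in plain Python. This is not Keras code: `LazyDense` is a hypothetical class that only mimics the pattern, where `build` receives the full shape (batch dimension included) and is triggered automatically by the first input.

```python
import random


class LazyDense:
    """Toy stand-in for a dense layer; illustrative only, not Keras code."""

    def __init__(self, units):
        self.units = units
        self.built = False
        self.kernel = None

    def build(self, input_shape):
        # input_shape includes the batch dimension, e.g. (None, 10) or (32, 10).
        # A dense layer only needs the last dimension to size its weights.
        in_features = input_shape[-1]
        self.kernel = [
            [random.uniform(-0.1, 0.1) for _ in range(self.units)]
            for _ in range(in_features)
        ]
        self.built = True

    def __call__(self, batch):
        if not self.built:
            # Infer the full shape, batch size included, from the first input.
            self.build((len(batch), len(batch[0])))
        # Plain matrix product: (batch, in_features) x (in_features, units).
        return [
            [sum(x * w for x, w in zip(row, col)) for col in zip(*self.kernel)]
            for row in batch
        ]


layer = LazyDense(units=3)
print(layer.built)                              # False
out = layer([[1.0, 2.0], [3.0, 4.0]])
print(layer.built)                              # True
print(len(layer.kernel), len(layer.kernel[0]))  # 2 3
print(len(out), len(out[0]))                    # 2 3
```

Here `build` ignores the batch size and reads only `input_shape[-1]`, but because the whole shape is passed in, a layer that does need the batch size could use it.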