No batch_size while making inference with BERT model

You are using SavedModelEstimator, which does not provide an option to pass in RunConfig or params arguments, because the model function graph is defined statically in the SavedModel.
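
For context, the error comes from BERT's input function, which expects the estimator to hand it a params dict containing a 'batch_size' key. A paraphrased sketch of the pattern used in run_classifier.py (names abbreviated, not the exact repo code):

import tensorflow as tf

def input_fn_builder(features, seq_length):
    """Paraphrased sketch of BERT's run_classifier.py input_fn_builder."""
    def input_fn(params):
        # The Estimator passes its params dict in here; BERT expects
        # 'batch_size' to be present, so an empty params dict raises
        # the KeyError you are seeing.
        batch_size = params["batch_size"]
        dataset = tf.data.Dataset.from_tensor_slices(features)
        return dataset.batch(batch_size)
    return input_fn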

Since SavedModelEstimator is a subclass of Estimator, params is merely a dictionary that stores hyperparameters. I think you could add the desired (key, value) pair to it before you call getPrediction1. One caveat: Estimator's params property returns a deep copy, so assigning through est.params would silently modify a throwaway copy; you have to write to the private _params attribute instead. For example:

est = tf.contrib.estimator.SavedModelEstimator(MODEL_FILE_PATH)
# est.params returns a deep copy, so write to the underlying dict directly.
est._params['batch_size'] = 1
predictions = getPrediction1(pred_sentences)
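
Alternatively, if mutating est._params feels too fragile, you can sidestep params entirely by writing your own predict input_fn with the batch size baked in. A minimal sketch, assuming features came from BERT's convert_examples_to_features and MAX_SEQ_LENGTH matches your model (the helper name make_predict_input_fn is mine, not from the repo):

import tensorflow as tf

def make_predict_input_fn(features, seq_length, batch_size=1):
    # Mirrors the structure of BERT's input_fn_builder, but ignores the
    # estimator's params dict and uses the batch_size captured here.
    def input_fn(params):  # params is ignored on purpose
        d = tf.data.Dataset.from_tensor_slices({
            "input_ids": tf.constant([f.input_ids for f in features],
                                     shape=[len(features), seq_length],
                                     dtype=tf.int32),
            "input_mask": tf.constant([f.input_mask for f in features],
                                      shape=[len(features), seq_length],
                                      dtype=tf.int32),
            "segment_ids": tf.constant([f.segment_ids for f in features],
                                       shape=[len(features), seq_length],
                                       dtype=tf.int32),
            "label_ids": tf.constant([f.label_id for f in features],
                                     shape=[len(features)], dtype=tf.int32),
        })
        return d.batch(batch_size, drop_remainder=False)
    return input_fn

predictions = est.predict(make_predict_input_fn(features, MAX_SEQ_LENGTH))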