How do I create padded batches in TensorFlow for tf.train.SequenceExample data using the Dataset API?

You need to pass a tuple of shapes, one shape per component of each dataset element. In your case you should pass

dataset = dataset.padded_batch(4, padded_shapes=([vectorSize],[None]))

or, to pad both components to the longest element in each batch, try

dataset = dataset.padded_batch(4, padded_shapes=([None],[None]))

Check this code for more details. I had to debug this method to figure out why it wasn't working for me.
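
To make the role of each entry in padded_shapes concrete, here is a minimal pure-Python sketch of the padding rule for 1-D components. It is not TensorFlow code and does not reflect how the library is implemented: pad_batch_1d and resolve_length are hypothetical helpers, the sample data is made up, and the padding value of 0 is an assumption. The idea it illustrates is that a None entry means "pad to the longest element in this batch", while an integer fixes the padded length.

```python
def resolve_length(sequences, target):
    """Return the padded length: the batch maximum if target is None."""
    longest = max(len(seq) for seq in sequences)
    if target is None:
        return longest
    if longest > target:
        # Assumption: treat an element longer than the requested shape as an error.
        raise ValueError(f"sequence of length {longest} exceeds padded shape {target}")
    return target


def pad_batch_1d(batch, padded_shapes, pad_value=0):
    """Pad each component of a batch of tuples of 1-D lists.

    batch: list of tuples, one tuple per dataset element.
    padded_shapes: one target length (int or None) per tuple component.
    """
    components = list(zip(*batch))  # regroup the batch by tuple component
    if len(components) != len(padded_shapes):
        raise ValueError("padded_shapes needs one entry per tuple component")
    padded = []
    for sequences, target in zip(components, padded_shapes):
        length = resolve_length(sequences, target)
        padded.append(
            [list(seq) + [pad_value] * (length - len(seq)) for seq in sequences]
        )
    return tuple(padded)


# Hypothetical data: a fixed-size vector (length 3) plus a variable-length sequence.
batch = [([1, 2, 3], [7]), ([4, 5, 6], [8, 9, 10])]
vectors, sequences = pad_batch_1d(batch, padded_shapes=(3, None))
print(vectors)    # [[1, 2, 3], [4, 5, 6]]
print(sequences)  # [[7, 0, 0], [8, 9, 10]]
```

Passing a single shape for a two-component element fails the length check in this sketch, which mirrors why a tuple of shapes is needed.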


If your current Dataset object contains tuples, you can also specify the padded shape of each element of the tuple.

For example, I have a (same_sized_images, labels) dataset in which each label has a different length but the same rank.

def process_label(resized_img, label):
    # Perform some tensor transformations
    # ......

    return resized_img, label

dataset = dataset.map(process_label)
dataset = dataset.padded_batch(batch_size,
                               padded_shapes=([None, None, 3],   # images: height, width, 3 channels
                                              [None, None]))     # labels have rank 2
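
As a rough illustration of what a [None, None] padded shape does to rank-2 labels, the following pure-Python sketch pads every label in a batch to the largest row count and the largest column count found in that batch. It is not TensorFlow code: pad_rank2 is a hypothetical helper, the labels are made-up data, and padding with 0 is an assumption.

```python
def pad_rank2(labels, pad_value=0):
    """Pad a batch of rank-2 labels (lists of rows) to a common [rows, cols] shape."""
    max_rows = max(len(label) for label in labels)
    max_cols = max((len(row) for label in labels for row in label), default=0)
    padded = []
    for label in labels:
        # Extend each existing row to the widest row in the batch.
        rows = [list(row) + [pad_value] * (max_cols - len(row)) for row in label]
        # Append all-padding rows until the label is as tall as the tallest one.
        rows += [[pad_value] * max_cols for _ in range(max_rows - len(rows))]
        padded.append(rows)
    return padded


# Hypothetical labels with the same rank but different shapes.
labels = [
    [[1, 2]],                 # shape [1, 2]
    [[3, 4, 5], [6, 7, 8]],   # shape [2, 3]
]
padded_labels = pad_rank2(labels)
print(padded_labels)  # [[[1, 2, 0], [0, 0, 0]], [[3, 4, 5], [6, 7, 8]]]
```

Each dimension is padded independently, so after batching every label in this example has shape [2, 3].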