Assign ImageDataGenerator result to Numpy array

When you use ImageDataGenerator, flow_from_directory returns a DirectoryIterator that yields the data in batches. You can extract the data either batch by batch or as a whole.

train_generator = train_datagen.flow_from_directory(
    train_parent_dir,
    target_size=(300, 300),
    batch_size=32,
    class_mode='categorical'
)

This prints:

Found 3875 images belonging to 3 classes.

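To extract the whole dataset as NumPy arrays (that is, not batch by batch), this code can be used. It reads the images and labels in a single pass: flow_from_directory shuffles by default, so looping over the generator once for x and a second time for y would leave the images and labels in different orders.

train_generator.reset()
# One pass over the data, so every image batch stays paired with its labels
batches = [next(train_generator) for _ in range(len(train_generator))]
x = np.concatenate([b[0] for b in batches])
y = np.concatenate([b[1] for b in batches])
print(x.shape)
print(y.shape)

Note: calling train_generator.reset() first, as above, is advised so that collection starts from the first batch.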

The output of the above code is:

(3875, 300, 300, 3)
(3875, 3)

The result is one NumPy array for the images and one for the labels, even though ImageDataGenerator loaded the data in batches of 32.
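If memory is tight, one variation on the idea above is to preallocate the output arrays and fill them slice by slice, rather than keeping every batch in a list and then concatenating. The sketch below is only an illustration: collect_into_preallocated and FakeBatches are hypothetical names, and FakeBatches is a NumPy-only stand-in for the Keras iterator so the example runs without any image files.

```python
import numpy as np


def collect_into_preallocated(batch_iter, n_samples):
    """Fill preallocated arrays from an iterator that yields (x_batch, y_batch)."""
    x_all = y_all = None
    filled = 0
    for _ in range(len(batch_iter)):  # exactly one pass over the data
        x_batch, y_batch = next(batch_iter)
        if x_all is None:
            # Allocate once, taking per-sample shape and dtype from the first batch.
            x_all = np.empty((n_samples,) + x_batch.shape[1:], dtype=x_batch.dtype)
            y_all = np.empty((n_samples,) + y_batch.shape[1:], dtype=y_batch.dtype)
        stop = filled + len(x_batch)
        x_all[filled:stop] = x_batch
        y_all[filled:stop] = y_batch
        filled = stop
    assert filled == n_samples, "iterator yielded an unexpected number of samples"
    return x_all, y_all


class FakeBatches:
    """Hypothetical stand-in for a Keras iterator: supports len() and next()."""

    def __init__(self, x, y, batch_size):
        self.x, self.y, self.batch_size = x, y, batch_size
        self.pos = 0

    def __len__(self):
        return -(-len(self.x) // self.batch_size)  # ceiling division

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos >= len(self.x):
            self.pos = 0  # wrap around, as Keras iterators do
        start = self.pos
        self.pos += self.batch_size
        return self.x[start:self.pos], self.y[start:self.pos]


# 10 samples in batches of 4, so the last batch holds only 2 samples.
x_src = np.arange(10 * 4 * 4 * 3, dtype=np.float32).reshape(10, 4, 4, 3)
y_src = np.eye(3, dtype=np.float32)[np.arange(10) % 3]
x_out, y_out = collect_into_preallocated(FakeBatches(x_src, y_src, 4), n_samples=10)
print(x_out.shape, y_out.shape)
```

With the real generator, the call would presumably look like collect_into_preallocated(train_generator, train_generator.samples) after a train_generator.reset().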

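To get the output as separate batches, use the following code. Because 3875 is not a multiple of 32, the last batch is smaller than the others, so the batches cannot be stacked into a regular array; they are stored in a 1-D object array instead (recent NumPy versions raise an error for ragged input unless dtype=object is given).

x = []
y = []
train_generator.reset()
for _ in range(len(train_generator)):
    a, b = next(train_generator)
    x.append(a)
    y.append(b)
x = np.array(x, dtype=object)  # ragged: the last batch is smaller
y = np.array(y, dtype=object)
print(x.shape)
print(y.shape)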

The output of this code is:

(122,)
(122,)
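The 122 here is the number of batches, not the number of images. A quick check of the arithmetic, using the image count and batch size from the example above:

```python
import math

n_images, batch_size = 3875, 32  # values from the example above

n_batches = math.ceil(n_images / batch_size)
last_batch = n_images - (n_batches - 1) * batch_size
print(n_batches, last_batch)  # 122 3
```

So there are 121 full batches of 32 images plus a final batch of 3, which is why the batches do not stack into one regular array.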

Hope this works as a solution


I had the same problem and solved it the following way. itr.next() (or next(itr)) returns the next batch of images as two numpy.ndarray objects, batch_x and batch_y (source: keras/preprocessing/image.py). So you can set batch_size in flow_from_directory to the size of your whole training dataset, and a single call then returns everything at once.

For example, my whole training set consists of 1481 images:

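train_datagen = ImageDataGenerator(rescale=1. / 255)
itr = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),  # Keras expects (height, width)
    batch_size=1481,  # the whole training set in one batch
    class_mode='categorical'
)

# A single call returns all images and their one-hot labels
X, y = next(itr)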
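Since this loads every image into memory in one go, it may be worth estimating the size of the resulting array first. The sketch below is a rough estimate only: estimate_bytes is a hypothetical helper, float32 pixels and 3 channels are assumptions, and the 150x150 size is a placeholder because the real img_width and img_height are not given here.

```python
import numpy as np


def estimate_bytes(n_images, height, width, channels=3, dtype=np.float32):
    """Bytes needed to hold all images in one dense array (assumed dtype)."""
    return n_images * height * width * channels * np.dtype(dtype).itemsize


# 1481 images as in the example; 150x150 is a hypothetical placeholder size.
size = estimate_bytes(1481, 150, 150)
print(f"{size / 1e9:.2f} GB")  # 0.40 GB
```

Under the same assumptions, 3875 images at 300x300 would need about 4.2 GB, so for larger datasets reading in batches may be the safer choice.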