What does initial_epoch in Keras mean?

The answer above is correct however it is important to note that if you have trained for 10 epochs and set initial_epoch=10 and epochs=20 you train for 10 more epochs until you reach a total of 20 epochs. For example I trained for 2 epochs, then set initial_epoch=2 and epochs=4. The result is it trains for 4-2=2 more epochs. The new data in the history object starts at epoch 3. So the returned history object does start from epoch 1 as you might expect. Another words the state of the history object is not preserved from the initial training epochs. If you do not set initial_epoch and you train for 2 epochs, then rerun the fit_generator with epochs=4 it will train for 4 more epochs starting from the state preserved at the end of the second epoch (provided you use the built in optimizers). Again the history object state is NOT preserved from the initial training and only contains the data for the last 4 epochs. I noticed this because I plot the validation loss versus epochs.


Since in some of the optimizers, some of their internal values (e.g. learning rate) are set using the current epoch value, or even you may have (custom) callbacks that depend on the current value of epoch, the initial_epoch argument let you specify the initial value of epoch to start from when training.

As stated in the documentation, this is mostly useful when you have trained your model for some epochs, say 10, and then saved it and now you want to load it and resume the training for another 10 epochs without disrupting the state of epoch-dependent objects (e.g. optimizer). So you would set initial_epoch=10 (i.e. we have trained the model for 10 epochs) and epochs=20 (not 10, since the total number of epochs to reach is 20) and then everything resume as if you were initially trained the model for 20 epochs in one single training session.

However, note that when using built-in optimizers of Keras you don't need to use initial_epoch, since they store and update their state internally (without considering the value of current epoch) and also when saving a model the state of the optimizer will be stored as well.