How does Keras load_data() know what part of the data is the train and test set?

The best way to find out is to look at Keras's code:

def load_data(path='mnist.npz'):
    # Download the archive (or reuse the cached copy) and get its local path
    path = get_file(path,
                    origin='https://s3.amazonaws.com/img-datasets/mnist.npz',
                    file_hash='8a61469f7ea1b51cbae51d4f78837e45')
    # The archive stores the train and test arrays under separate keys
    with np.load(path, allow_pickle=True) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)

As you can see, the function does not split anything itself. It downloads a file containing the dataset, which is already separated into train and test data and stored under separate keys (x_train, y_train, x_test, y_test). The only parameter (path) is where to store the downloaded dataset locally.
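
To see the same idea without downloading anything, here is a minimal sketch that builds a tiny stand-in archive with the same four keys and reads it back the way load_data() does. The helper name load_presplit, the file name toy_mnist.npz, and the array sizes are made up for illustration and are not part of Keras:

```python
import os
import tempfile

import numpy as np


def load_presplit(path):
    """Read an .npz archive that already stores train and test arrays
    under separate keys (hypothetical helper mirroring load_data)."""
    with np.load(path) as f:
        x_train, y_train = f["x_train"], f["y_train"]
        x_test, y_test = f["x_test"], f["y_test"]
    return (x_train, y_train), (x_test, y_test)


# Build a tiny stand-in archive; sizes are arbitrary, not the real MNIST sizes.
tmp_dir = tempfile.mkdtemp()
toy_path = os.path.join(tmp_dir, "toy_mnist.npz")
np.savez(
    toy_path,
    x_train=np.zeros((6, 28, 28), dtype=np.uint8),
    y_train=np.arange(6, dtype=np.uint8),
    x_test=np.zeros((2, 28, 28), dtype=np.uint8),
    y_test=np.arange(2, dtype=np.uint8),
)

# List the array names stored in the archive.
with np.load(toy_path) as f:
    keys = sorted(f.files)

(x_train, y_train), (x_test, y_test) = load_presplit(toy_path)
print(keys)                         # ['x_test', 'x_train', 'y_test', 'y_train']
print(x_train.shape, x_test.shape)  # (6, 28, 28) (2, 28, 28)
```

The split is a property of the file, not of the loading code: whoever created the archive decided which samples went into which array, and the loader just returns them by name.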

Tags:

Keras