What does "incremental load" mean?

Incremental loading is used when moving data from one repository (Database) to another.

Non-incremental loading would be when the destination has the entire data from the source pushed to it.

Incremental would be only passing across the new and amended data.

A concrete example:

A company may have two platforms, one that processes orders, and a seperate accounting system. The accounts department enters new customer details into the accounting system but has to ensure these customers appear in the order processing system.

To do this it runs a nightly batch job that sends data from the accounting system to the order system.

If they were deleting all customer details in the order system and refilling with all the customers in the accounting system then they would be performing a non-incremental load.

If they only sent accross the new customers and the customers that had been changed they would be performing an incremental load.


It generally means only loading into the warehouse the records that have changed (inserts, updates, and deletes if applicable) since the last load; as opposed to doing a full load of all the data (all records, including those that haven't changed since the last load) into the warehouse.

The advantage is that it reduces the amount of data being transferred from system to system, as a full load may take hours / days to complete depending on volume of data.

The main disadvantage is around maintainability. With a full load, if there's an error you can re-run the entire load without having to do much else in the way of cleanup / preparation. With an incremental load, the files generally need to be loaded in order. So if you have a problem with one batch, others queue up behind it until you correct it. Alternately you may find an error in a batch from a few days ago, and need to re-load that batch once corrected, followed by every subsequent batch in order to ensure that the data in the warehouse is consistent.