Drupal - How Batch Works Around the PHP Timeout

The Batch API simply says: "I will do at most N things, then do a page refresh ... and do more."

If you do 5 items per chunk that take 5 seconds each, that is 25 seconds of work per request, which fits within the default PHP timeout of 30 seconds.

If you do 20 items per chunk that take 5 seconds each, that is 100 seconds of work per request; the workload is too high and each request will likely time out.

Remember that in PHP the lifecycle of a page is request in -> response out, and that your webserver keeps each thread alive for only a finite amount of time. You have to work around that timeout, which is what the Batch API helps you do.
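As a rough sketch of that pattern (the `mymodule_*` names and the per-pass limit are placeholders, not anything the Batch API prescribes), a Drupal 7 batch operation can use `$context['sandbox']` to remember how far it got and `$context['finished']` to tell the Batch API whether to refresh again:

    <?php

    /**
     * Batch operation callback: do a small, fixed amount of work per request.
     */
    function mymodule_batch_op($total, &$context) {
      // First pass: set up state that survives across page refreshes.
      if (!isset($context['sandbox']['progress'])) {
        $context['sandbox']['progress'] = 0;
        $context['sandbox']['max'] = $total;
      }

      // Keep each request well inside PHP's max_execution_time.
      $per_pass = 5;

      for ($i = 0; $i < $per_pass && $context['sandbox']['progress'] < $context['sandbox']['max']; $i++) {
        // ... one unit of real work here (e.g. create or update one node) ...
        $context['sandbox']['progress']++;
      }

      // Anything below 1 means "refresh the page and call me again".
      $context['finished'] = $context['sandbox']['progress'] / $context['sandbox']['max'];
    }

    /**
     * Kicks off the batch; batch_process() drives the refresh cycle.
     */
    function mymodule_start_batch($total) {
      batch_set(array(
        'title' => t('Processing items'),
        'operations' => array(
          array('mymodule_batch_op', array($total)),
        ),
      ));
      batch_process('admin/reports/mymodule');
    }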

Running the work server side instead, for example from Drush or with community modules such as Migrate, can let you avoid the timeout entirely if need be.
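A sketch of the Drush route, assuming Drush for Drupal 7 (the `mymodule-import` command and its callback are hypothetical, not stock Drush): the same batch is processed from the command line, where there is no webserver timeout to work around.

    /**
     * Implements hook_drush_command() for a hypothetical 'mymodule-import' command.
     */
    function mymodule_drush_command() {
      return array(
        'mymodule-import' => array(
          'description' => 'Run the import batch from the command line.',
        ),
      );
    }

    /**
     * Command callback: same batch, but processed by Drush instead of the browser.
     */
    function drush_mymodule_import() {
      batch_set(array(
        'operations' => array(array('mymodule_batch_op', array(1000))),
      ));
      // Mark the batch as non-progressive (no page refreshes) and let Drush run it.
      $batch =& batch_get();
      $batch['progressive'] = FALSE;
      drush_backend_batch_process();
    }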

EDIT

Also bear in mind that every page request is a full Drupal bootstrap, after which the Batch API picks up where it left off. Reloading Drupal every N items is one of the most expensive parts of using the Batch API. That is why people have been working on server-side techniques to create nodes, import content, etc. The Batch API is great for simple, repetitive tasks, but it tends to fall apart with either complex or very, very large datasets.


The Batch API simply registers _batch_shutdown() as a shutdown function with register_shutdown_function(). That function saves the current state of the batch being executed in a database table.
The Batch API doesn't provide any guarantee that the operation you are executing is not interrupted in the middle. That is why batch operations normally perform simple steps, such as reading a database row from one table and saving a row in another table.
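To illustrate why operations are kept that small, here is a sketch of such an operation (the table names, column names, and `mymodule_` prefix are made up): it copies exactly one row per call, so if the request dies mid-way, at most one row of work is repeated when the batch resumes from the saved state.

    /**
     * Copies one row per call from {source_table} to {destination_table}.
     * The sandbox records the last id copied so the next request resumes there.
     */
    function mymodule_copy_row_op(&$context) {
      if (!isset($context['sandbox']['last_id'])) {
        $context['sandbox']['last_id'] = 0;
        $context['sandbox']['copied'] = 0;
        $context['sandbox']['total'] = db_query('SELECT COUNT(*) FROM {source_table}')->fetchField();
      }

      // Read one row from the source table.
      $row = db_query_range(
        'SELECT id, data FROM {source_table} WHERE id > :last ORDER BY id',
        0, 1,
        array(':last' => $context['sandbox']['last_id'])
      )->fetchAssoc();

      if (!$row) {
        // Nothing left to copy.
        $context['finished'] = 1;
        return;
      }

      // Save it in the destination table and record the progress.
      db_insert('destination_table')->fields($row)->execute();
      $context['sandbox']['last_id'] = $row['id'];
      $context['sandbox']['copied']++;

      $context['finished'] = $context['sandbox']['total'] == 0
        ? 1
        : $context['sandbox']['copied'] / $context['sandbox']['total'];
    }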