Importing large SQL dump with millions of INSERT statements

This dump was written as individual INSERT statements (with pg_dump --inserts):

INSERT INTO esa2010_codes VALUES (11002, 'Národn
INSERT INTO esa2010_codes VALUES (11003, 'Nefina
INSERT INTO esa2010_codes VALUES (12502, 'Národn
INSERT INTO esa2010_codes VALUES (11001, 'Verejn
INSERT INTO esa2010_codes VALUES (12602, 'Národn
INSERT INTO esa2010_codes VALUES (12603, 'Finanč
INSERT INTO esa2010_codes VALUES (12503, 'Ostatn

This is documented as being slow (from man pg_dump):

--inserts Dump data as INSERT commands (rather than COPY). This will make restoration very slow; it is mainly useful for making dumps that can be loaded into non-PostgreSQL databases. However, since this option generates a separate command for each row, an error in reloading a row causes only that row to be lost rather than the entire table contents. Note that the restore might fail altogether if you have rearranged column order. The --column-inserts option is safe against column order changes, though even slower.

That's why it's so slow. What you'll want to do is turn off some of the durability settings, specifically synchronous_commit, though turning off fsync will help too.

You can do this very simply by running the following command in psql before you run your \i file.sql:

SET synchronous_commit TO off;

That will do a lot to speed it up. Don't forget to turn the durability options back on after you're done. I bet it'll finish in a few hours once you've set that. If you need more speed, though, don't hesitate to turn off fsync and full_page_writes on the cluster until the data is loaded, though I wouldn't do that if the DB already held data you needed, or if it was production. As a last note, if you need the speed and this is a production DB, you can go all out on your own copy, then dump that copy with pg_dump's default options; the resulting dump will load much faster.
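For what it's worth, here is a minimal sketch of the session-level approach as a shell snippet. The names load_async.sql and mydb are made up for illustration; file.sql is the dump mentioned above. The snippet only writes a small psql wrapper script, so nothing touches the database until you run it yourself.

```shell
# Write a small psql wrapper script (load_async.sql is a hypothetical name).
cat > load_async.sql <<'EOF'
-- Affects this session only; other connections keep normal durability.
SET synchronous_commit TO off;
\i file.sql
-- Put the setting back to its default for the rest of the session.
RESET synchronous_commit;
EOF

# Then run it against the target database (mydb is a placeholder), e.g.:
#   psql -d mydb -f load_async.sql
```

Because SET only changes the current session, nothing needs to be edited in postgresql.conf for this variant. Turning off fsync or full_page_writes is different: those are cluster-wide settings.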


Another option is to run the import in one transaction (if that is possible):

BEGIN;
\i dump.sql
COMMIT;

PostgreSQL runs in autocommit mode by default: every command is ended by a commit, and every commit is ended by an fsync (and fsync is pretty slow). The number of fsyncs can be reduced by asynchronous commit (Evan Carroll's reply), or reduced to one by an explicit transaction.
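If you'd rather not type the BEGIN and COMMIT yourself, psql can issue them for you. A small sketch, assuming a shell wrapper: the function name load_in_one_transaction and the arguments mydb and dump.sql are hypothetical, while --single-transaction and the ON_ERROR_STOP variable are standard psql features.

```shell
# Load a plain SQL dump inside a single transaction.
#   $1 = target database name, $2 = path to the dump file
load_in_one_transaction() {
  # --single-transaction wraps the -f file in BEGIN ... COMMIT.
  # ON_ERROR_STOP=1 makes psql stop at the first error instead of
  # running the rest of the file inside an already aborted transaction.
  psql --single-transaction -v ON_ERROR_STOP=1 -d "$1" -f "$2"
}

# Usage (placeholder names):
#   load_in_one_transaction mydb dump.sql
```

Keep in mind the trade-off: in one transaction, a single bad row rolls back the whole import, whereas the autocommit load loses only that row.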

Another possibility is disabling the referential integrity checks (if they are used). This is possible because we can expect the dump to be consistent and correct. For details, see the command ALTER TABLE xx DISABLE TRIGGER ALL.
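A rough sketch of how that could look, under a few assumptions: the dump is data-only (so the table esa2010_codes from the question already exists), it is the only table being loaded, and you connect as a superuser, which PostgreSQL requires for disabling the internally generated constraint triggers. The file name load_no_triggers.sql is made up.

```shell
# Write a psql wrapper that disables triggers around the import
# (load_no_triggers.sql is a hypothetical name).
cat > load_no_triggers.sql <<'EOF'
BEGIN;
-- Disables all triggers on the table, including the RI (foreign key) ones.
ALTER TABLE esa2010_codes DISABLE TRIGGER ALL;
\i dump.sql
-- Re-enabling does not re-check rows that were loaded in the meantime.
ALTER TABLE esa2010_codes ENABLE TRIGGER ALL;
COMMIT;
EOF

# Then, as a superuser (mydb is a placeholder):
#   psql -d mydb -f load_no_triggers.sql
```

Since the rows loaded while the triggers are off are never validated, this is only safe when you really do trust the dump.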

The source of your file is pg_dump. The simplest speedup comes from choosing different options when the dump is created.

  1. Don't use the --inserts option. The COPY format is significantly faster to restore.

  2. Use the --disable-triggers option to disable the RI checks (this assumes the data is correct).

  3. Use the custom format (the -F option, i.e. -Fc). Then you can use pg_restore to restore the data and build the indexes (the slowest operation) in parallel, using its -j option.
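The third point can be sketched roughly like this. The function names, database names, archive name and job count are all placeholders; -Fc, -f, -j and -d are regular pg_dump and pg_restore options.

```shell
# Dump a database in pg_dump's custom archive format.
#   $1 = source database, $2 = output archive file
dump_custom() {
  pg_dump -Fc -f "$2" "$1"
}

# Restore a custom-format archive using several parallel jobs.
#   $1 = target database, $2 = archive file, $3 = number of jobs
restore_parallel() {
  pg_restore -j "$3" -d "$1" "$2"
}

# Usage (placeholder names):
#   dump_custom sourcedb dump.custom
#   restore_parallel targetdb dump.custom 4
```

A reasonable starting point for the job count is the number of CPU cores on the server, but that is a guess to tune, not a rule.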