Transferring a large amount (84 million rows) of data efficiently

I would add that, however you decide to approach this, you'll need to batch these transactions. I've had very good luck with the linked article lately, and I appreciate the way it takes advantage of indexes as opposed to most batched solutions I see.

Even minimally logged, those are big transactions, and you could spend a lot of time dealing with the ramifications of abnormal log growth (VLFs, truncating, right-sizing, etc.).
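As a rough illustration of index-friendly batching (not necessarily the linked article's method), the sketch below walks the key in ranges so that each batch is a seek rather than a scan. The tables dbo.source and dbo.dest, the columns id and payload, and the batch size are all hypothetical; it assumes id is a positive integer key with a clustered index and SQL Server 2008 or later (for DECLARE with an initial value).

```sql
-- Hypothetical schema: dbo.source(id INT PRIMARY KEY, payload ...), dbo.dest with the same columns.
DECLARE @batch_size INT = 100000;  -- assumed batch size; tune for your log and I/O
DECLARE @last_id    INT = 0;       -- assumes all ids are greater than 0
DECLARE @next_id    INT;

WHILE 1 = 1
BEGIN
    -- Find the highest key in the next batch by seeking past the last key copied.
    SELECT @next_id = MAX(id)
    FROM (SELECT TOP (@batch_size) id
          FROM dbo.source
          WHERE id > @last_id
          ORDER BY id) AS batch;

    IF @next_id IS NULL BREAK;     -- nothing left to copy

    -- Each INSERT runs as its own autocommit transaction, keeping transactions small.
    INSERT INTO dbo.dest (id, payload)
    SELECT id, payload
    FROM dbo.source
    WHERE id > @last_id AND id <= @next_id;

    SET @last_id = @next_id;
END;
```

Under the FULL recovery model the log can only be truncated after a log backup, so small batches help with log size only if log backups run while the loop is in progress.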

Thanks


"Efficient" could apply to log file usage, I/O performance, CPU time or execution time.

I would try to achieve a minimally logged operation, which would be fairly efficient from a logging perspective. This should save you some execution time as a bonus. If you have the tempdb space, the following might work for you.

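-- Sketch only: dest, source, keep_condition and YourDatabase are placeholders.
CREATE TABLE #temp (/* same columns as source */);
ALTER DATABASE [YourDatabase] SET RECOVERY BULK_LOGGED;  -- recovery model is a database-level setting

BEGIN TRANSACTION;

    INSERT INTO dest SELECT * FROM source;
    INSERT INTO #temp SELECT * FROM source WHERE keep_condition = 1;
    TRUNCATE TABLE source;
    INSERT INTO source SELECT * FROM #temp;

COMMIT TRANSACTION;

ALTER DATABASE [YourDatabase] SET RECOVERY FULL;
DROP TABLE #temp;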

For a minimally logged operation to happen, a number of conditions have to be true, including no backups currently running, the database set to the BULK_LOGGED recovery model, and, depending on your indexes, the target table may have to be empty. Some of this behaviour also changed (improved) from SQL Server 2005 to 2008.
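A couple of these conditions can be checked or requested directly. The sketch below assumes SQL Server 2008 or later and hypothetical tables dbo.source and dbo.dest with columns id and payload. The TABLOCK hint is one of the documented prerequisites for a minimally logged INSERT ... SELECT into a heap; whether the insert actually ends up minimally logged still depends on the target's indexes and the other conditions above.

```sql
-- Confirm the recovery model of the current database.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = DB_NAME();

-- Request a table lock on the target (hypothetical table and column names).
INSERT INTO dbo.dest WITH (TABLOCK) (id, payload)
SELECT id, payload
FROM dbo.source;
```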

Then again, without knowing the specifics of your table and data, any of your other options may well perform better. Try using

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

... and see which works best.
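One way to compare candidates is to wrap each one in the statistics settings and then look at log usage afterwards. This is only a sketch: dbo.source, dbo.dest, id and payload are hypothetical names, and the INSERT stands in for whichever approach you are testing.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate statement under test (hypothetical tables and columns).
INSERT INTO dbo.dest (id, payload)
SELECT id, payload
FROM dbo.source;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

-- Log size and percentage used per database, to compare log growth between approaches.
DBCC SQLPERF(LOGSPACE);
```

The I/O and timing figures appear on the Messages tab in Management Studio. Run each candidate against a comparable copy of the data so the numbers are comparable.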

EDIT: When performing bulk-logged operations, make sure you make a backup (full or transaction log) before and after the operation if you need point-in-time restore capability and you suspect that other activity may be going on in the database at the same time that your ETL job is running.
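A minimal sketch of that backup bracket, using transaction log backups. The database name and backup paths are hypothetical, and the database is assumed to start in the FULL recovery model with an existing full backup.

```sql
-- Log backup immediately before switching recovery models.
BACKUP LOG [YourDatabase]
    TO DISK = N'X:\Backups\YourDatabase_before_bulk.trn';  -- hypothetical path

ALTER DATABASE [YourDatabase] SET RECOVERY BULK_LOGGED;

-- ... run the bulk operation here ...

ALTER DATABASE [YourDatabase] SET RECOVERY FULL;

-- Log backup immediately after, so point-in-time restore is available again from here on.
BACKUP LOG [YourDatabase]
    TO DISK = N'X:\Backups\YourDatabase_after_bulk.trn';   -- hypothetical path
```

A log backup that contains minimally logged operations can only be restored in full, not to a point in time inside it, which is why keeping the window between the two backups short matters.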

I wrote a blog post on minimally logged operations a while ago; it links to other posts and documentation.

Tags:

Sql Server