What's the best way to archive all but current year and partition the table at the same time

Why are you not getting minimal logging?

I have found the Data Loading Performance Guide, which you reference, to be an extremely valuable resource. However, it is also not 100% comprehensive, and I suspect that the grid is complex enough already that the author did not add a column Table Partitioning to break out differences in behavior depending on whether the table receiving the inserts is partitioned. As we will see later, the fact that table is already partitioned appears to inhibit minimal logging.

enter image description here

Recommended approach

Based upon the recommendations in the Data Loading Performance Guide (including the "Bulk Loading a Partitioned Table" section) as well as extensive experience loading partitioned tables with tens of billions of rows, here is the approach I would recommend:

  • Create a new database.
  • Create new tables partitioned by month in the new database.
  • Move the most recent year of data, in the following fashion:
    • For each month, create a new heap table;
    • Insert that month of data into the heap using the TABLOCK hint;
    • Add the clustered index to the heap containing that month of data;
    • Add the check constraint enforcing that the table contains just this month's data;
    • Switch the table into the corresponding partition of the new overall partitioned table.
  • Do a rename swap of the two databases.
  • Truncate the data in the now "archive" database.
  • Partition each of the tables in the "archive" database.
  • Use partition swaps to archive the data in the future.

The differences when compared to your original approach:

  • The methodology of moving the recent 12-13 months of data will be much more efficient if you load into a heap with TABLOCK one month at a time, using partition switching to place the data into the partitioned table.
  • A DELETE to clear away old table will be fully logged. Perhaps you can either TRUNCATE or drop the table and create a new archive table.

Comparison of approaches for moving the recent year of data

In order to compare approaches in a reasonable amount of time on my machine, I used a 100MM row test data set that I generated and that follows your schema.

As you can see from the results below, there is a large performance boost and reduction in log writes by loading data into a heap using the TABLOCK hint. There is an additional benefit if this is done one partition at a time. It's also worth noting that the one-partition-at-a-time method can easily be parallelized further if you run multiple partitions at once. Depending on your hardware, that might yield a nice boost; we typically load at least four partitions at once on server-class hardware.

enter image description here

Here is the full test script.

Final notes

All of these results depend on your hardware to some degree. However, my tests were conducted on a standard quad-core laptop with spinning disk drive. It's likely that the data loads should be much faster if you are using a decent server that does not have a lot of other load at the time you are conducting this process.

For example, I ran the recommended approach on an actual dev server (Dell R720) and saw a reduction to 76 seconds (from 156 seconds on my laptop). Interestingly, the original approach of inserting into a partitioned table did not experience the same improvement and still took just over 12 minutes on the dev server. Presumably this is because this pattern yields a serial execution plan, and a single processor on my laptop can match a single processor on the dev server.