Looping through 16 million records using ArcPy?

You could write the Objectid and result of the calculation (cate_2) to a csv file. Then join the csv to your original file, populate a field, to preserve the result. This way you aren't updating the table using DA cursor. You could use a Search cursor.


Apologies, if I keep reviving this old thread. The idea was to perform the if-else statements on the combine raster and then use the new field in Lookup to create a new raster. I complicated the problem by exporting the data as a table and introduced inefficient workflow which was addressed by @Alex Tereshenkov. After realizing the obvious, I batched the data into 17 queries (1 million each) as it was suggested by @FelixIP. It took each batch on average ~1.5 minutes to complete and total run time was ~23.3 minutes. This method eliminates need for joins and I think this method best accomplishes the task. Here's a revised script for future reference:

import arcpy, time
from arcpy import env

def cursor():
    combine = "D:/mosaic.gdb/combine_2013"
    #arcpy.AddField_management(combine,"cat_1","SHORT")
    fields = ['wat_agg', 'dev_agg', 'herb_agg','forest_agg', 'cat_1']
    batch = ['"OBJECTID" >= 1 AND "OBJECTID" <= 1000000', '"OBJECTID" >= 1000001 AND "OBJECTID" <= 2000000', '"OBJECTID" >= 2000001 AND "OBJECTID" <= 3000000', '"OBJECTID" >= 3000001 AND "OBJECTID" <= 4000000', '"OBJECTID" >= 4000001 AND "OBJECTID" <= 5000000', '"OBJECTID" >= 5000001 AND "OBJECTID" <= 6000000', '"OBJECTID" >= 6000001 AND "OBJECTID" <= 7000000', '"OBJECTID" >= 7000001 AND "OBJECTID" <= 8000000', '"OBJECTID" >= 8000001 AND "OBJECTID" <= 9000000', '"OBJECTID" >= 9000001 AND "OBJECTID" <= 10000000', '"OBJECTID" >= 10000001 AND "OBJECTID" <= 11000000', '"OBJECTID" >= 11000001 AND "OBJECTID" <= 12000000', '"OBJECTID" >= 12000001 AND "OBJECTID" <= 13000000', '"OBJECTID" >= 13000001 AND "OBJECTID" <= 14000000', '"OBJECTID" >= 14000001 AND "OBJECTID" <= 15000000', '"OBJECTID" >= 15000001 AND "OBJECTID" <= 16000000', '"OBJECTID" >= 16000001 AND "OBJECTID" <= 16757856']
    for i in batch:
        start_time = time.time()
        with arcpy.da.UpdateCursor(combine, fields, i) as cursor:
            for row in cursor:
            # row's 0,1,2,3,4,5 = water, dev, herb, forest, category
            #classficiation water = 1; dev = 2; herb = 3; ; forest = 4
                if (row[0] >= 0 and row[0] >= row[3]):
                    row[4] = 1
                elif (row[1] > 1):
                    row[4] = 2
                elif (row[2] > 180):
                    row[4] = 3
                elif (row[3] >= 0 and row[3] > row[0]):
                    row[4] = 4
                cursor.updateRow(row)
        end_time =  time.time() - start_time
        print str(end_time) + " - Seconds"

cursor()