SQLAlchemy update multiple rows in one transaction

As an extension to @Martijn's excellent answer i've done it in a way that allows you to increment rather than set the value:

# List of items to increment [id, increment value]
increments = [
    [1,1],
    [2,5],
    [3,8]
]
# Extract ids
ids = [v[0] for v in increments]
# Create payload
payload = {
    v[0]:Table.col2 + v[1] for v in increments
}
# Get the product
query(Table) \
    .filter(Products.col1.in_(ids)) \
    .update({
        Table.col2: case(
            payload, 
            value=Table.col1
        )
    })

Yes, updating a larger number of rows with a single bulk UPDATE statement will be a lot faster than using individual UPDATEs on each and every object. An IN filter would only help you limit what rows are updated, but you still need to tell the database what value to use for the col2 updates.

You can use a CASE ... WHEN ... THEN construct for that, with the case() function:

from sqlalchemy.sql import case

query(MyTable).filter(
    MyTable.col1.in_(payload)
).update({
    MyTable.col2: case(
        payload,
        value=MyTable.col1,
    )
}, synchronize_session=False)

The above a) selects rows where the col1 value is a key in the payload dictionary, then b) updates the col2 column value using a CASE statement that picks values from that same dictionary to update that column based on matching col1 against the keys.

With payload set to {'x': 'y', 'a': 'b', 'c': 'd'}, the above executes the following query (give or take the exact order of WHEN clauses and values in the IN test):

UPDATE mytable
SET
    col2=CASE mytable.col1
        WHEN 'x' THEN 'y'
        WHEN 'a' THEN 'b'
        WHEN 'c' THEN 'd'
    END
WHERE
    mytable.col1 IN ('x', 'a', 'c')

I set synchronize_session to False there, as updating all possible cached MyTable instances at once is perhaps not the best idea when updating a large number of rows. Your other options are 'evaluate' and 'fetch'.

  • We can't use the default 'evaluate' (which would find existing objects in the session that match the where clause, to update in-place), because SQLAlchemy currently doesn't know how to process an IN filter (you get an UnevaluatableError exception).

  • If you do use 'fetch' then all instances of MyTable cached in the session that were affected are updated with new values for col2 (as mapped by their primary key).

Note that a commit would expire the session anyway, so you'd only want to use 'fetch' if you need to do some more work with the updated rows before you can commit the current transaction.

See the Query.update() documentation for more information on what synchronize_session options you have.


Another solution that I found before the chosen answer which also works fast would be:

# payload = {'x': 'y', 'a': 'b', 'c': 'd'}
all_rows = query(MyTable).filter(
    MyTable.col1.in_(payload)
)
for row in all_rows:
    row.col2=payload[row.col1]

This would do aSELECT though, which will add a couple of seconds, but is helpful to someone who finds it more convenient having the full row, also giving more flexibility.