How to do a bulk update in Firestore

Here's how I did a bulk update in Node.js (use case: update every document in the users collection to add a new field, country):

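// Assumes `firestore` is an already initialized Firestore instance
async function updateCountry(countryName) { // async is required because the body uses await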

  let lastUserUpdated = null;
  let updatedUsersCount = 0;

  const usersRef = firestore.collection("users");
  const batchSize = 500;

  while (true) {

    const batch = firestore.batch();
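    let usersQuery = usersRef
      .orderBy("userId") // To start after a particular user, there should be some order, hence orderBy is necessary
      .limit(batchSize);
    if (lastUserUpdated) { // No cursor on the first page; later pages resume after the last updated user
      usersQuery = usersQuery.startAfter(lastUserUpdated);
    }
    const usersList = await usersQuery.get();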

    if (usersList.size === 0) { // When all the users have been traversed, we can exit from the loop
      break;
    }
    usersList.forEach((user) => {
      lastUserUpdated = user;
      batch.update(user.ref, { country: countryName }); // Here, the update will be added to the batch, but won't be executed now
      updatedUsersCount++;
    });
    await batch.commit(); // At this point, everything added to the batch will be executed together
    console.log(`Updated ${updatedUsersCount} users.`);
  }
}

In SQL you can do:

UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

The WHERE clause is optional, so you can set a column to a specific value for every row, or only for a selection of rows, and you don't need to obtain a reference to the rows first.

For example, if content needs review, you could set a flag in a review column for all rows.

In Firestore, I don't think there is a way to write to a whole collection at once.

https://firebase.google.com/docs/reference/js/firebase.firestore.CollectionReference

All the collection methods are ways to add a document, get a document reference, or filter a search for documents.

So, as far as I can see, there is no way to update a set of documents in Firestore without first obtaining references to them.

Batched writes speed this up, as you can update up to 500 documents at a time.
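As a rough sketch of that idea: once you have the document references, you can split them into chunks and commit one batch per chunk. The helper name `updateInChunks` and its parameters are hypothetical. `firestore` is assumed to be an initialized Firestore instance that exposes `batch()`, and the default chunk size of 500 is taken from the limit mentioned above.

```javascript
// Sketch only: update many documents in chunks of at most `chunkSize` writes.
// `refs` is an array of DocumentReference objects you have already obtained.
async function updateInChunks(firestore, refs, data, chunkSize = 500) {
  let updated = 0;
  for (let start = 0; start < refs.length; start += chunkSize) {
    const chunk = refs.slice(start, start + chunkSize);
    const batch = firestore.batch();
    // Queue one update per reference in this chunk; nothing is sent yet
    for (const ref of chunk) {
      batch.update(ref, data);
    }
    // Each chunk is committed atomically, but separate chunks are independent commits
    await batch.commit();
    updated += chunk.length;
  }
  return updated;
}
```

Note that only each individual chunk is atomic. If a later chunk fails, the earlier chunks stay committed, so the update should be safe to re-run.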


I was looking for a solution and found none, so I wrote this one, in case anyone is interested.

public boolean bulkUpdate() {
  try {
    // see https://firebase.google.com/docs/firestore/quotas#writes_and_transactions
    int writeBatchLimit = 500;
    int totalUpdates = 0;

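    DocumentSnapshot lastDocument = null; // cursor: last document of the previous page

    while (true) {
      WriteBatch writeBatch = this.firestoreDB.batch();

      Query query =
          this.firestoreDB.collection("animals")
              .whereEqualTo("species", "cat")
              .limit(writeBatchLimit);
      if (lastDocument != null) {
        // Without a cursor, every iteration would fetch the same first page again
        query = query.startAfter(lastDocument);
      }

      List<QueryDocumentSnapshot> documentsInBatch = query.get().get().getDocuments();

      if (documentsInBatch.isEmpty()) {
        break;
      }

      documentsInBatch.forEach(
          document -> writeBatch.update(document.getReference(), "hasTail", true));

      writeBatch.commit().get(); // blocks until the whole batch has been applied

      totalUpdates += documentsInBatch.size();
      lastDocument = documentsInBatch.get(documentsInBatch.size() - 1);
    }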

    System.out.println("Number of updates: " + totalUpdates);

  } catch (Exception e) {
    e.printStackTrace(); // report the failure instead of swallowing it silently
    return false;
  }
  return true;
}

If you have used the Firebase Realtime Database, you may remember that writing to several completely separate locations atomically was not possible with individual writes. That's why you had to use batched writes, which mean that either all of the operations succeed or none of them are applied.

In Firestore, each individual operation is processed atomically. In addition, you can execute multiple write operations as a single batch that contains any combination of set(), update(), or delete() operations. A batch of writes completes atomically and can write to multiple documents.

This is a simple example of a batch that combines set, update, and delete operations:

WriteBatch batch = db.batch();

DocumentReference johnRef = db.collection("users").document("John");
batch.set(johnRef, new User());

DocumentReference maryRef = db.collection("users").document("Mary");
batch.update(maryRef, "name", "Anna", "age", 20); // Update name and age (field names "name" and "age" are assumed)

DocumentReference alexRef = db.collection("users").document("Alex");
batch.delete(alexRef);

batch.commit().addOnCompleteListener(new OnCompleteListener<Void>() {
    @Override
    public void onComplete(@NonNull Task<Void> task) {
        // ...
    }
});

Calling the commit() method on the batch object commits the entire batch: either every queued write is applied or none of them is.
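For readers using Node.js rather than Android, a comparable batch might look roughly like the sketch below. The function name `applyUserChanges` and the field values are hypothetical, and `db` is assumed to be an initialized Firestore instance.

```javascript
// Sketch only: one batch mixing set, update and delete, using the Node.js API shape.
async function applyUserChanges(db) {
  const users = db.collection("users");
  const batch = db.batch();

  batch.set(users.doc("John"), { name: "John" }); // create or overwrite the document
  batch.update(users.doc("Mary"), { age: 20 });   // update requires the document to exist
  batch.delete(users.doc("Alex"));                // remove the document

  try {
    await batch.commit(); // all three writes are applied together
    return true;
  } catch (err) {
    // If commit rejects, none of the queued writes were applied
    console.error("Batch failed:", err);
    return false;
  }
}
```

The function returns a boolean rather than rethrowing, which mirrors the success/failure style of the Java bulkUpdate() answer above.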