In Cloud Firestore, why is it not possible to "single bulk" delete a collection (as can be done with Realtime Database)?

The RTDB is able to do this because each database is local to a single region. In order to provide a serialized view, when you call remove(), the database stops all other work until the removal is complete.

This behavior has been the cause of several apparent outages: if a remove() call has to delete huge swaths of data, all other activity is effectively locked out until it completes. As a result, even for RTDB users who want to delete large quantities of data, we have recommended recursively finding and deleting documents in groups (CLI, node.js).

Firestore, on the other hand, is based on more traditional Google-style storage infrastructure, where different ranges of keys are assigned dynamically to different servers (storage isn't actually backed by BigTable, but the same principles apply). This means that deleting data is no longer necessarily a single-region action, and it becomes very expensive to make the deletion appear transactional. Firestore transactions are currently limited to 100 participants, which means that any non-trivial transactional bulk deletion is impossible.
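To make the key-range point concrete, here is a toy Python model. It is not Firestore's actual implementation: the split points, server indices and sample keys are all made up for illustration. It only shows how a delete over a set of keys can end up involving several servers at once.

```python
from bisect import bisect_right

# Hypothetical split points: server 0 owns keys below "g", server 1 owns
# ["g", "n"), server 2 owns ["n", "t"), and server 3 owns everything from "t" up.
SPLITS = ["g", "n", "t"]

def server_for(key):
    """Return the index of the (made-up) server whose key range contains key."""
    return bisect_right(SPLITS, key)

def servers_touched(keys):
    """Return the set of servers that a delete over these keys would involve."""
    return {server_for(k) for k in keys}

keys = ["apple", "grape", "melon", "peach", "zebra"]
print(sorted(servers_touched(keys)))  # -> [0, 1, 2, 3]
```

Five keys already span all four ranges in this model, so an atomic delete would have to coordinate every one of those servers.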

We're investigating how best to surface an API that does a bulk deletion without promising transactional behavior. It's straightforward to imagine how to do this from a mobile client, but as you've observed, this wouldn't be efficient if all we did was embed the loop and batch delete for you. We also don't want to make REST clients second-class citizens.

Firestore is a new product and there are a ton of things still to do. Unfortunately, this just hasn't made the cut. While this is something we hope to address eventually, I can't provide any timeline on when that would be.

In the meantime, the console and the firebase command-line tool both provide a non-transactional means of doing this, e.g. for test automation.

Thanks for your understanding and thanks for trying Firestore!


I was happily refactoring my app for Firestore from Realtime Database, enjoying the shorter code and simpler syntax, until I refactored the delete() functions! To delete a document with subcollections:

  • Create an array of promises.
  • get() a subcollection that doesn't have further subcollections.
  • Iterate through a forEach() function to read each document in the subcollection.
  • Delete each document, and push the delete command into the array of promises.
  • Go on to the next subcollection and repeat this.
  • Use Promise.all(arrayOfPromises) to wait until all the subcollections have been deleted.
  • Then delete the top-level document.

With multiple layers of collections and documents, you'll want to make that a function, then call it from another function to handle the next higher layer, and so on.
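The steps above can be sketched as a single recursive function. This is a minimal Python sketch, assuming the server client library, where a document reference exposes collections() and a collection reference exposes list_documents(). The web SDK used in my app does not list subcollections, so there you have to name them yourself. The function name delete_document_tree is my own, not part of any SDK.

```python
def delete_document_tree(doc_ref):
    """Delete doc_ref and every document beneath it, deepest documents first.

    doc_ref is assumed to behave like a Firestore DocumentReference:
    collections() yields its subcollections and delete() removes the document.
    Returns the number of documents deleted.
    """
    deleted = 0
    for sub_coll in doc_ref.collections():
        # list_documents() yields a reference for each document in the subcollection.
        for child_ref in sub_coll.list_documents():
            deleted += delete_document_tree(child_ref)
    # Only after all the subcollections are gone, delete the document itself.
    doc_ref.delete()
    return deleted + 1
```

You would call it with a reference to the top-level document, for example delete_document_tree(db.collection('Videos').document('clip1')), where the path is a placeholder. Nothing here is transactional: if a delete fails halfway, the documents already removed stay removed.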

You can see this in the console. To manually delete collections and documents, delete the right-most document, then delete the right-most collection, and so on working left.

Here's my code, in AngularJS. It only works if the top-level collection wasn't deleted before the subcollections.

$scope.deleteClip = function(docId) {
  if (docId === undefined) {
    docId = $scope.movieOrTvShow + '_' + $scope.clipInMovieModel;
  }
  $scope.languageVideos = longLanguageFactory.toController($scope.language) + 'Videos';
  // Reference to the clip document; both subcollections hang off this path.
  var clipRef = firebase.firestore().collection($scope.languageVideos).doc($scope.movieOrTvShow).collection('Video Clips').doc(docId);

  // Read one subcollection and delete every document in it.
  // Returns a promise that resolves once all of the deletes have finished.
  function deleteSubcollection(name) {
    return clipRef.collection(name).get().then(function(snapshot) {
      var promises = [];
      snapshot.forEach(function(doc) {
        console.log(doc.id);
        promises.push(doc.ref.delete());
      });
      return Promise.all(promises);
    });
  }

  // Wait for both subcollections to be emptied before deleting the clip itself.
  // (Calling Promise.all() on an array that is filled inside a later .then()
  // would resolve immediately, before any delete had been queued.)
  Promise.all([
    deleteSubcollection('SentenceTranslations'),
    deleteSubcollection('SentenceExplanations')
  ])
  .then(function() {
    console.log("All subcollections deleted.");
    return clipRef.delete();
  })
  .then(function() {
    console.log("Clip document deleted.");
    $scope.clipInMovieModel = null;
    $scope.$apply();
  })
  .catch(function(error) {
    console.log("Remove failed: " + error.message);
  });
};

All that would have been one line in Realtime Database.


This is the fastest way I have found to delete all documents in a collection: a mix of the Python delete-collection loop and the Python batch method.

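from google.cloud import firestore

# Assumes default application credentials; adapt this to however you create your client.
db = firestore.Client()

def delete_collection(coll_ref, batch_size, counter):
    """Delete coll_ref in batches of batch_size; counter is the running total."""
    batch = db.batch()
    # Fetch at most one batch worth of documents.
    docs = coll_ref.limit(batch_size).stream()
    deleted = 0

    for doc in docs:
        batch.delete(doc.reference)
        deleted += 1

    batch.commit()
    if deleted >= batch_size:
        # A full batch means more documents may remain, so go around again.
        new_counter = counter + deleted
        print("potentially deleted: " + str(new_counter))
        return delete_collection(coll_ref, batch_size, new_counter)

delete_collection(db.collection(u'productsNew'), 500, 0)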

This deletes all documents from the collection "productsNew" in blocks of 500, which is currently the maximum number of documents that can be passed to a commit. See Firebase write and transaction quotas.

You can get more sophisticated and also handle API errors, but this works fine for me.
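For anyone who does want error handling, here is one possible sketch, not something I run in production. It swaps the recursion for a loop and retries a failed commit with a simple exponential backoff. The function name and the max_attempts, retry_on and sleep parameters are my own inventions. retry_on defaults to Exception only to keep the sketch self-contained; in real code you would probably narrow it to the API error classes of your client library.

```python
import time

def delete_collection_with_retry(db, coll_ref, batch_size=500, max_attempts=3,
                                 retry_on=(Exception,), sleep=time.sleep):
    """Delete every document in coll_ref in batches, retrying failed commits.

    db is assumed to provide batch(), and coll_ref to provide limit(n).stream(),
    as the Firestore client objects do. Returns the number of documents deleted.
    """
    total = 0
    while True:
        # Read up to one batch worth of document references.
        doc_refs = [doc.reference for doc in coll_ref.limit(batch_size).stream()]
        if not doc_refs:
            return total
        for attempt in range(1, max_attempts + 1):
            # Build a fresh batch for every attempt.
            batch = db.batch()
            for ref in doc_refs:
                batch.delete(ref)
            try:
                batch.commit()
                break
            except retry_on:
                if attempt == max_attempts:
                    raise  # give up and let the caller see the error
                sleep(2 ** attempt)  # back off 2s, then 4s, and so on
        total += len(doc_refs)
```

Retrying is safe here because deleting a document that is already gone is harmless, so a commit that partly succeeded can simply be sent again.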