How accurate is MongoDB's estimated count query?

Comparing the two, I find it hard to think of a scenario in which you'd want to use countDocuments() when estimatedDocumentCount() is an option.

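estimatedDocumentCount() is only an option when you want the size of the whole collection, because it takes no query filter; its equivalent form is countDocuments({}), i.e., an empty query filter. The first reads the count from collection metadata, so its cost is effectively O(1); the second has to count the documents, so its cost is O(N), and if N is very large, that cost can be prohibitive.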

Both return a count that, on a live MongoDB deployment, is likely to be ephemeral: it can be out of date the moment you have it, because the collection keeps changing.

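See the MongoDB documentation for estimatedDocumentCount(). Specifically, it notes that "After an unclean shutdown of a mongod using the Wired Tiger storage engine, count statistics reported by db.collection.estimatedDocumentCount() may be inaccurate." This is because the count comes from collection metadata rather than from the documents themselves, and that metadata can drift from the last checkpoint. The amount of drift depends on how many insert, update, or delete operations happened between the last checkpoint and the unclean shutdown; checkpoints usually occur every 60 seconds. The documentation recommends running validate on the affected collections to restore the statistics.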

In contrast, the MongoDB documentation for countDocuments() states that this method wraps an aggregation: a $match stage for the query filter, followed by a $group stage that uses $sum: 1 to count the matching documents. The count therefore comes from the documents themselves rather than from metadata.
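
As a minimal sketch of what that wrapper amounts to, the JavaScript below builds the two-stage pipeline described in the documentation. The helper name countDocumentsPipeline is hypothetical and not part of MongoDB; it only constructs the stages and does not contact a server.

```javascript
// Hypothetical helper (not a MongoDB API): returns the aggregation pipeline
// that countDocuments(filter) is documented to wrap.
function countDocumentsPipeline(filter = {}) {
  return [
    // Keep only the documents that match the query filter.
    { $match: filter },
    // Collapse them into one group and add 1 per document.
    { $group: { _id: null, n: { $sum: 1 } } },
  ];
}
```

In mongosh, passing the result to db.collection.aggregate() should yield a single document whose n field holds the count, assuming at least one document matches. With an empty filter every document goes through the $group stage, which is where the O(N) cost comes from.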

Thus, if absolute accuracy is essential, use countDocuments(). If all you need is a rough estimate, use estimatedDocumentCount(). The names are accurate and should be used accordingly.
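
That rule of thumb can be sketched as a small helper. The function name countFor and its options are hypothetical; the collection argument is assumed to expose countDocuments(filter) and estimatedDocumentCount(), as the Collection object in the MongoDB Node.js driver does.

```javascript
// Hypothetical helper: choose a counting method based on what the caller needs.
// `collection` is assumed to provide countDocuments(filter) and
// estimatedDocumentCount(); with the Node.js driver both return promises,
// which this function passes through unchanged.
function countFor(collection, { filter = {}, exact = false } = {}) {
  const hasFilter = Object.keys(filter).length > 0;
  // estimatedDocumentCount() takes no filter, so any filter forces the
  // document-based count, as does an explicit request for exactness.
  if (exact || hasFilter) {
    return collection.countDocuments(filter);
  }
  // Whole-collection count where a rough figure is acceptable.
  return collection.estimatedDocumentCount();
}
```

With a real driver collection this would be awaited, for example `await countFor(coll, { exact: true })`.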
