MongoDB / WiredTiger: reduce storage size after deleting properties from documents

Just to clarify, please be careful about using repairDatabase on a replica set node. repairDatabase is meant to salvage readable data, e.g. after disk corruption: it removes unreadable data so that MongoDB can start despite the corruption.

If this node has an undetected disk corruption and you run repairDatabase on it, the node could end up with different data from the other nodes as a result. Since MongoDB assumes that all nodes in a replica set contain identical data, this could lead to crashes and hard-to-diagnose problems. Due to its nature, this issue could stay dormant for a long time and then suddenly manifest itself with a vengeance, seemingly without any apparent reason.

WiredTiger will eventually reuse the empty spaces with new data, and the periodic checkpointing that WiredTiger does could potentially release space to the OS without any intervention on your part.
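To see how much of a collection's file WiredTiger currently treats as free for reuse, you can look at the "file bytes available for reuse" counter under wiredTiger["block-manager"] in the collection stats. The helper below is only a rough sketch: the function names are made up for illustration, and it assumes the stats object has the shape returned by db.<collection>.stats() on a WiredTiger deployment.

```javascript
// Sketch: summarize how much of a collection file is free for reuse.
// `stats` is assumed to be the object returned by db.<collection>.stats().
// reusableBytes and summarizeReuse are hypothetical helper names.

function toPlainNumber(value) {
  if (value === undefined || value === null) return null;
  // Some shells wrap 64-bit integers in an object exposing toNumber().
  if (typeof value.toNumber === "function") return value.toNumber();
  return Number(value);
}

function reusableBytes(stats) {
  const blockManager = (stats.wiredTiger || {})["block-manager"] || {};
  return toPlainNumber(blockManager["file bytes available for reuse"]);
}

function summarizeReuse(stats) {
  const storageSize = toPlainNumber(stats.storageSize);
  const reusable = reusableBytes(stats);
  const known = reusable !== null && storageSize !== null && storageSize > 0;
  return {
    storageSize: storageSize,
    reusable: reusable,
    // Fraction of the on-disk file that holds no live data (null if unknown).
    reusableFraction: known ? reusable / storageSize : null,
  };
}
```

In the shell you could call it as summarizeReuse(db.mycoll.stats()), where mycoll is a placeholder collection name. A large reusable fraction suggests that new writes will fill the holes before the file needs to grow.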

If you really need to give space back to the OS, then an initial sync is the safest choice if you have a replica set. On a standalone, dump/restore will achieve the same result. Otherwise, compact is a safer choice than repairDatabase. Please back up your data before doing any of these, since in my opinion this qualifies as major maintenance.
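If you go the compact route, keep in mind that compact works on one collection at a time, and as far as I know it has to be run on each replica set member separately. A minimal sketch of building the per-collection commands follows. compactCommands is a hypothetical helper name, and skipping system.* collections is my own assumption rather than a requirement.

```javascript
// Sketch: build one { compact: <name> } command document per collection.
// compactCommands is a hypothetical helper, not part of the shell.
function compactCommands(collectionNames) {
  return collectionNames
    // Assumption: leave system.* collections alone.
    .filter(function (name) { return name.indexOf("system.") !== 0; })
    .map(function (name) { return { compact: name }; });
}

// In the shell, against the current database, this could be used as:
//   compactCommands(db.getCollectionNames()).forEach(function (cmd) {
//     printjson(db.runCommand(cmd));
//   });
```

Check the compact documentation for your server version before running this, since its blocking behaviour and the rules for running it on a primary have changed between releases.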


If you inspect the database using db.stats(), you will find dataSize and storageSize. storageSize may stay larger than dataSize after you delete documents from the database, and neither db.repairDatabase() nor the compact command is guaranteed to reduce it. In that case, the more reliable way to reclaim disk space is to create a dump archive with mongodump, drop the database, and then restore it with mongorestore.

mongodump --gzip --archive=dump.gzip
mongo
> use <database>
> db.dropDatabase()
mongorestore --gzip --archive=dump.gzip

This solution requires downtime, and the duration depends on the size of the database.

FYI: MongoDB does not release disk space after you delete a document. Instead, it reuses that space for future documents, which is why storageSize can be bigger than dataSize.