Getting total term frequency throughout entire index (Elasticsearch)

I believe you need to set term_statistics to true, as per the Elasticsearch documentation:

Term statistics: Setting term_statistics to true (default is false) will return

total term frequency (how often a term occurs in all documents)

document frequency (the number of documents containing the current term)

By default these values are not returned since term statistics can have a serious performance impact.
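
As a rough illustration, a term vectors request with term statistics enabled could be assembled like this. This is only a sketch: the index name "my-index", document id "1" and field "text" are made up, the helper functions are hypothetical, and the path assumes the typeless /<index>/_termvectors/<id> form used by recent Elasticsearch versions (older versions include the type in the path). Nothing is sent over the network here; the snippet only builds the request and shows where ttf sits in the response.

```python
import json


def build_termvectors_request(index, doc_id, field):
    """Return (method, path, body) for a term vectors request with term statistics."""
    # Assumption: typeless endpoint; adjust the path for older, typed indices.
    path = f"/{index}/_termvectors/{doc_id}"
    body = {
        "fields": [field],
        "term_statistics": True,    # off by default; adds ttf and doc_freq per term
        "field_statistics": False,
        "positions": False,
        "offsets": False,
    }
    return "GET", path, body


def total_term_freq(response, field, term):
    """Read ttf for one term from a term vectors response, or None if absent."""
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    entry = terms.get(term)
    return None if entry is None else entry.get("ttf")


method, path, body = build_termvectors_request("my-index", "1", "text")
print(method, path)
print(json.dumps(body, indent=2))
```

You would send that body with whatever HTTP client you already use, then pass the parsed JSON response to total_term_freq to read the ttf value for a given term.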


The counts differ because the term statistics returned with term vectors are not accurate unless the index in question has a single shard. In an index with multiple shards, the documents are distributed across the shards, and the statistics are retrieved only from the shard that holds the requested document, so the frequency returned is that shard's count rather than the total for the whole index.

Thus, the returned frequency is only a relative measure, not the absolute value you expect; see the Behaviour section of the term vectors documentation. To test this, create a single-shard index and request the frequency; it should give you the actual total.
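
For the single-shard test, the index creation request might look roughly like the following. Again a sketch under assumptions: "tf-test" is a made-up index name, the helper is hypothetical, and setting number_of_replicas to 0 is just my choice for a throwaway test index, not something the test requires.

```python
import json


def build_single_shard_index_request(index):
    """Return (method, path, body) for creating an index with exactly one primary shard."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": 1,     # one shard, so shard-level stats cover the whole index
                "number_of_replicas": 0,   # assumption: no replicas needed for a quick test
            }
        }
    }
    return "PUT", f"/{index}", body


method, path, body = build_single_shard_index_request("tf-test")
print(method, path)
print(json.dumps(body, indent=2))
```

After indexing the same documents into that index, the ttf reported by the term vectors request should match the count you computed by hand.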