Why are partial PostgreSQL HASH indices not smaller than full indices?

I would argue that this is a bug in the hash index code. When you create an index on an already-populated table, the code tries to pre-size the index to hold all the data, so that it doesn't have to keep splitting buckets while the index is being built. But the pre-sizing code takes into account neither the NULL fraction of the column nor (apparently) the selectivity of the partial index clause, so it arrives at a number that is too large.

If you create the index first and then populate the table, you will find that the hash index is small, whether you made it partial or not. And if the table is going to grow substantially after the index is created, the extra space consumed at creation time will eventually be put to good use.
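
To see this for yourself, something along these lines should work. It is only a sketch: the table and index names (demo_late, demo_early and so on), the row count, and the 1% non-NULL ratio are made up for illustration, and the exact sizes will depend on your PostgreSQL version.

```sql
-- Case 1: populate the table first, then build the hash indexes.
CREATE TABLE demo_late (id bigint, val integer);

INSERT INTO demo_late (id, val)
SELECT g, CASE WHEN g % 100 = 0 THEN g END   -- about 99% of val is NULL
FROM generate_series(1, 1000000) AS g;

CREATE INDEX demo_late_full    ON demo_late USING hash (val);
CREATE INDEX demo_late_partial ON demo_late USING hash (val)
    WHERE val IS NOT NULL;

-- Case 2: build the hash indexes on an empty table, then populate it.
CREATE TABLE demo_early (id bigint, val integer);

CREATE INDEX demo_early_full    ON demo_early USING hash (val);
CREATE INDEX demo_early_partial ON demo_early USING hash (val)
    WHERE val IS NOT NULL;

INSERT INTO demo_early (id, val)
SELECT g, CASE WHEN g % 100 = 0 THEN g END
FROM generate_series(1, 1000000) AS g;

-- Compare the on-disk size of all four indexes.
SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS index_size
FROM pg_class
WHERE relname IN ('demo_late_full', 'demo_late_partial',
                  'demo_early_full', 'demo_early_partial')
ORDER BY relname;
```

If the explanation above is right, the two demo_late indexes should come out about the same size as each other and noticeably larger than the two demo_early indexes.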


It's not explicitly stated in the documentation, but in the source code there is the following comment:

/*
 * We do not insert null values into hash indexes.  This is okay because
 * the only supported search operator is '=', and we assume it is strict.
 */

So the IS NOT NULL predicate does indeed change nothing, since null values are never stored in hash indexes anyway (which makes sense, as comparing null values with = never returns true).
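
The strictness of = is easy to check. This is a small sketch under default settings (transform_null_equals off); the column aliases are just for illustration.

```sql
-- Comparing NULL with = yields NULL rather than true,
-- so a lookup with = can never match a NULL entry.
SELECT (NULL::integer = NULL::integer)         AS eq_result,
       (NULL::integer = NULL::integer) IS TRUE AS eq_is_true;
```

Here eq_result comes back as NULL and eq_is_true as false, which is why a hash index has no use for NULL entries.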