postgres bloom index

When bloom was introduced in 9.6, parallel query had also just been introduced and was turned off by default. In the example given, bloom appeared to be better than a non-parallel sequential scan. But once a parallel seq scan is available, the planner estimates it to be cheaper than using the bloom index. It is not actually faster, as can be verified by turning off parallel query with set max_parallel_workers_per_gather TO 0 and looking at the actual execution speeds, but the planner thinks the parallel seq scan will be better. It looks like the cost estimation part of bloom could use some work.
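The comparison described above could be run roughly as follows. This is only a sketch: the table name tbloom appears in the plan quoted in this thread, but the column names and the WHERE clause are assumed to match the example in the bloom documentation, so adjust them to your own table and query.

```sql
-- Assumption: tbloom and this query follow the bloom documentation example;
-- the column names i2 and i5 are not taken from this thread.

-- 1. Default settings: on v10 and later the planner is expected to choose
--    a Parallel Seq Scan. Note the reported execution time.
EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;

-- 2. Disable parallel query for the current session only and run it again.
--    If the bloom index is now chosen, compare its execution time with step 1.
SET max_parallel_workers_per_gather TO 0;
EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;

-- 3. Restore the configured value.
RESET max_parallel_workers_per_gather;
```

Running each statement more than once helps separate caching effects from the difference between the plans.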

The example code was not updated when parallel query was turned on by default in v10, so it no longer works as advertised.

Note that your example never achieved any index usage at all, so you can't really draw any conclusions about which index is better for that scenario.


Update

This seems to be a bug with estimates.

Bloom internally uses genericcostestimate for its cost estimate. That estimate is undercut once the seq scan goes parallel.

Old attempt at answer

You're not even using the index you're creating (bloom or btree)

->  Parallel Seq Scan on tbloom  (cost=0.00..126195.00 rows=1 width=24) (actual time=256.224..256.224 rows=0 loops=3)

That shows you are scanning the entire table with parallel workers. Your indexing is irrelevant because neither index is used (thus the <1% difference). Did you ANALYZE the table after you created the index? If so, try

set enable_seqscan = 0;

And run the EXPLAIN ANALYZE for the query again. I would expect the bloom index to speed things up by massively reducing how much of the table you have to visit.
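Put together, the suggested check might look like the sketch below. The query is an assumption borrowed from the bloom documentation example (only the table name tbloom comes from the plan above). Note that enable_seqscan = off does not forbid sequential scans; it only makes the planner treat them as very expensive, so a seq scan can still appear if no other plan is possible.

```sql
-- Assumption: the query and column names follow the bloom documentation example.

-- Refresh planner statistics after creating the index.
ANALYZE tbloom;

-- Discourage sequential scans for this session only.
SET enable_seqscan = off;

-- Check whether an index now shows up in the plan, and how long it takes.
EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;

-- Restore the configured value so other queries are planned normally.
RESET enable_seqscan;
```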


Answer left in comments by a-horse-with-no-name

It seems to depend on the value of random_page_cost. On my laptop, where I have an SSD, this is set to 1, and in that case the bloom index is used. On a server that has random_page_cost higher than 1, the seq scan is used:

https://explain.depesz.com/s/6Ynx

If you lower it to 1 (at least on Postgres 11), the bloom index looks cheaper to the optimizer, and thus it chooses the index scan:

screenshot

However, setting that value to 1 only makes sense on SSDs; it is not a good idea for spinning hard disks.
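To try this without touching the server configuration, the setting can be changed for a single session. This is a sketch only: the query is assumed to be the one from the bloom documentation example, and whether the plan actually switches will depend on your version and data.

```sql
-- Assumption: the query and column names follow the bloom documentation example.

-- Show the current value (the PostgreSQL default is 4).
SHOW random_page_cost;

-- Lower it for this session only, then look at the plan the optimizer chooses.
SET random_page_cost = 1;
EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;

-- Go back to the configured value.
RESET random_page_cost;
```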

Analyzing the table does not change things (tested in Postgres 11). It seems the bloom costing could do with some adjustments.