How is it possible for Hash Index not to be faster than Btree for equality lookups?

Disk based Btree indexes truly are O(log N), but that is pretty much irrelevant for disk arrays that fit in this solar system. Due to caching, they are mostly O(1) with a very large constant plus O((log N)-1) with a small constant. Formally, that is the same thing as O(log N), because constants don't matter in big O notation. But they do matter in reality.

Much of the slow down in hash index lookups came from the need to protect against corruption or deadlocks caused by hash-table resizing concurrent with the lookups. Until recent versions (every version you mention is comically out of date), this need led to even higher constants and to rather poor concurrency. Vastly more man hours went into the optimization of BTree concurrency than hash concurrency.


Hash lookup is theoretically an O(1) operation when the key hash maps directly to the physical location of the target record. The way it works in Postgres, if I understand it correctly, is a bit more complicated: the key hash maps to a bucket that contains the OID you're looking for. A bucket can potentially comprise more than one page, which you need to sequentially scan until you find your particular key (hash). This is why it appears slower than you expect.

The hash index access method README file in the source code repo has all the details.


The why? issue is already addressed by other answers, but I question whether the premise is still correct.

Things have moved on in Postgres since 9.6. Hash indexes are now first-class citizens (as they are WAL logged and thus safe to use). And I measured a 40% performance increase over btree in a simple test (unique integers) on Postgres 11.

Like all benchmarks this should be treated with extreme caution. But it's no longer the case that hash indexes are never faster than btree.

The benchmark retrieves 1e7 rows out of 1e8 in a large synthetic table, and consists of the following SQL statements:

create table hash_test as 
select * from generate_series(1e10, 1e10+1e8) as id;

create index idx_btree on hash_test using btree (id); -- 2.5 minutes
create index idx_hash on hash_test using hash (id); -- 4 minutes
analyze hash_test;

-- enable one index (e.g. idx_hash) and disable the other:
update pg_index set indisvalid = (indexrelid = 'idx_hash'::regclass)
where indexrelid in ('idx_btree'::regclass, 'idx_hash'::regclass);

-- fetch and sum 1e7 randomly selected rows:
with ids_to_fetch as (
    select 1e10 + (random() * 1e8)::int as id 
    from generate_series(1, 1e7) as num_rows_to_fetch
)
select sum(id) from hash_test
natural inner join ids_to_fetch

-- only idx_btree enabled: 72, 72, 72 seconds (3 runs)
-- only idx_hash enabled: 42, 43, 43 seconds (3 runs)