HashMap resize method implementation detail

The design consideration is documented in the same source file, in a code comment at line 211:

* When bin lists are treeified, split, or untreeified, we keep 
* them in the same relative access/traversal order (i.e., field 
* Node.next) to better preserve locality, and to slightly 
* simplify handling of splits and traversals that invoke 
* iterator.remove. When using comparators on insertion, to keep a 
* total ordering (or as close as is required here) across 
* rebalancings, we compare classes and identityHashCodes as 
* tie-breakers. 

Since removing mappings via an iterator can’t trigger a resize, the reasons to retain the order specifically in resize are “to better preserve locality, and to slightly simplify handling of splits”, as well as staying consistent with the overall policy.
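
To make the “split” case concrete, here is a simplified sketch of the order-preserving split a resize performs on each bin, in the spirit of JDK 8's resize() (loHead/loTail/hiHead/hiTail mirror the local variable names there, but this Node class is a stripped-down stand-in, not the real java.util.HashMap.Node):

    // Stripped-down stand-in for java.util.HashMap.Node.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Split one bin of the old table into a "lo" list (stays at index j)
    // and a "hi" list (moves to index j + oldCap). Nodes are appended at
    // the tail of each list, so the relative order within each list is
    // exactly the order they had in the original bin.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null;
        Node hiHead = null, hiTail = null;
        Node next;
        for (Node e = head; e != null; e = next) {
            next = e.next;                      // cache before relinking
            if ((e.hash & oldCap) == 0) {       // new index bit is 0: stays put
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                            // new index bit is 1: moves up
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
        }
        if (loTail != null) loTail.next = null; // terminate both lists
        if (hiTail != null) hiTail.next = null;
        return new Node[] { loHead, hiHead };
    }

Appending at the tail (rather than pushing at the head, as JDK 7's transfer did) is what keeps the relative order stable across a resize.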


There are two common reasons for maintaining order in bins implemented as linked lists:

One is that you maintain order by increasing (or decreasing) hash-value. That means when searching a bin you can stop as soon as the current item is greater (or less, as applicable) than the hash being searched for.
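
For illustration, a minimal sketch of that first approach, assuming a bin kept sorted by ascending hash (this Entry type is hypothetical, not the JDK's):

    static final class Entry<K, V> {
        final int hash;
        final K key;
        V value;
        Entry<K, V> next;
        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    // Scan a bin kept sorted by ascending hash; the scan stops early as
    // soon as it has passed the place where the target hash would be.
    static <K, V> V findInSortedBin(Entry<K, V> head, int hash, K key) {
        for (Entry<K, V> e = head; e != null; e = e.next) {
            if (e.hash > hash) return null;   // passed the slot: key is absent
            if (e.hash == hash && key.equals(e.key)) return e.value;
        }
        return null;
    }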

Another approach involves moving entries to the front (or nearer the front) of the bucket when accessed or just adding them to the front. That suits situations where the probability of an element being accessed is high if it has just been accessed.
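
And a minimal sketch of the active variant (move-to-front on access), reusing the hypothetical Entry type from the previous sketch; the caller stores the returned head back into the table slot:

    // Find `key` and, on a hit, move its node to the front of the bucket
    // so the next access to it touches only the head. Returns the
    // (possibly new) head of the bucket.
    static <K, V> Entry<K, V> getAndMoveToFront(Entry<K, V> head, K key) {
        Entry<K, V> prev = null;
        for (Entry<K, V> e = head; e != null; prev = e, e = e.next) {
            if (key.equals(e.key)) {
                if (prev != null) {           // not already at the front
                    prev.next = e.next;       // unlink from current position
                    e.next = head;            // relink at the head
                    head = e;
                }
                return head;
            }
        }
        return head;                          // miss: bucket unchanged
    }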

I've looked at the source for JDK-8 and it appears (at least for the most part) to be doing the passive version of the latter (adding to the front):

http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/HashMap.java

While it's true that you should never rely on iteration order from containers that don't guarantee it, that doesn't mean it can't be exploited for performance if it's structural. Also notice that the implementer of a class is in a privileged position to exploit its implementation details in a formal way that a user of that class should not.

If you look at the source, understand how it's implemented, and exploit it, you're taking a risk. If the implementer does it, that's a different matter!

Note: I have an implementation of an algorithm called Hashlife that relies heavily on a hash-table. It uses this model: make the table size a power of two, because (a) you can find the entry by bit-masking (& mask) rather than by division, and (b) rehashing is simplified because you only ever 'unzip' hash-bins.
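
A small sketch of both points, with illustrative values (none of this comes from the Hashlife code itself):

    int capacity = 16;                        // table size, a power of two
    int hash = "glider".hashCode();           // arbitrary example key
    // (a) bucket index by bit-masking rather than division; for a
    // power-of-two capacity this equals Math.floorMod(hash, capacity):
    int index = hash & (capacity - 1);
    // (b) on doubling to 32, each bin "unzips" into exactly two bins: an
    // entry stays at `index` or moves to `index + 16`, depending on the
    // one extra hash bit the larger mask now sees (the same split
    // sketched earlier).
    int newIndex = (hash & 16) == 0 ? index : index + 16;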

Benchmarking showed that algorithm gaining around 20% from actively moving patterns to the front of their bin when accessed.

The algorithm pretty much exploits repeating structures in cellular automata, which are common, so if you've seen a pattern, the chances of seeing it again are high.


Order in a Map is really bad [...]

It's not bad, it's (in academic terminology) whatever. What Stuart Marks wrote at the first link you posted:

[...] preserve flexibility for future implementation changes [...]

Which means (as I understand it) that the current implementation happens to keep the order, but if a better implementation is found in the future, it will be adopted whether or not it keeps the order.