What is a hash function in java?

The Wikipedia article will have a lot of technical information, but a simplistic view of hashing is something like the following.

Imagine that there's a magical function that can give a number to any object. Given the same object, it always return the same number.

Immediately now you have a quick way to test if two objects are the same: ask this function for their numbers and compare. If they're different, then they're not the same.

But what if they have the same number? Could two different objects have the same number?

Yes, this is possible in most scenario. Let's say that the function can only give numbers between 1..10, for example, and there are 100 different objects. Then of course some different objects must have the same number. This is what is called a "collision". A "collision" makes our quick equality test not as useful, so as much as possible we want to minimize its happening. A good magical function is one that would try to minimize the number of "collision".

So what else can you do with this number? Well, you can use it to index an array. Given an object, you can put it at the index given by the number from this magical function. This array is essentially what a hashtable is; this magical function is a hash function.


A hash function is a way to create a compact representation of an arbitrarily large amount of data. In java with the hashcode method this means somehow describing the state of your object (no matter how large) in an int (4 bytes). And is usually written to be a fairly fast as explained below.

To simplify in hashtables/hashmaps the hashcode serves as sort of a cheap equals. Take two objects a and b of type Foo lets says to figure out if a.equals(b) takes 500 ms where as calculating a (efficient) hashcode only take 10ms. So if we want to know if a.equals(b) instead of doing that directly first we will look at the hashcodes and ask does a.hashCode() == b.hashCode(). Note that this will take only 20ms in our example.

Because of the API definition of hashcode we know that if the hashcode of a is not equal to b then a.equals(b) should never be true. So in our above test if we see the hashcodes are unequal then we never need to do the longer .equals() test, this is why you should always override hashCode and equals together.

You may also see references about writing "good" or "well distributed" hashcodes. This has to do with the fact that the inverse of the previous statements about hashcode and equals is not true. More specifically a.hashCode() == b.hashCode() does not necessarily imply a.equals(b) So the idea of a good hashcode is you reduce the likelyhood of a.hashCode() == b.hashCode() when a.equals(b) is false. You may have seen this referred to as a collision of a hash function.

Back to hashmaps/tables. These are based on key/value pairs. So when you add or retrieve a value you will supply a key. So the first thing the map has to do is look for the key, which means finding something that .equals() the key you provide. But as we discussed above .equals() can be incredibly slow which means comparisons can be greatly sped up by checking hashcodes first. Since when the hashcodes are well distributed you should know quickly when x is definitely != y.

Now in addition to the comparison hashmaps/tables actually use the hashcodes to organize their internal storage of the data, however I think that is beyond the scope of what you are looking to understand at this point.


HASH FUNCTION:- A hash function takes a group of characters (called a key) and maps it to a value of a certain length (called a hash value or hash). The hash value is representative of the original string of characters, but is normally smaller than the original. Hashing is done for indexing and locating items in databases because it is easier to find the shorter hash value than the longer string. Hashing is also used in encryption.This term is also known as a hashing algorithm or message digest function.

HASH MAP:- HashMap is a collection class that is designed to store elements as key-value pairs. Maps provide a way of looking up one thing based on the value of another.

enter image description here

A lookup table that is designed to efficiently store non-contiguous keys (account numbers, part numbers, etc.) that may have wide gaps in their alphabetic or numeric sequences.

HASH TABLE:- Hash tables are created with an algorithm that stores the keys into hash buckets, which contain key-value pairs. Since different keys may hash to the same bucket, the goal of hash table design is to spread out the key-value pairs evenly with each bucket containing as few key-value pairs as possible. When an item is looked up, its key is hashed to find the appropriate bucket, and the bucket is then compared to find the right key-value pair.

enter image description here