How are hashing algorithms useful if the implementation is public?

Being public is exactly the point: everyone can see how it's done and how difficult it is to reverse. It's like being shown a ginormous jigsaw puzzle with a trillion pieces, every piece in its place, and then watching it get shuffled. You know all the pieces form the puzzle (you just saw it), and you know it's very, very difficult to put everything back. A public hash shows you how it's done (the result) and how difficult it is to do everything in reverse.

A public hash function is just a set of mathematical operations. Anyone can (but only a few will) do the operations by hand and check that the algorithm works as expected. Anyone can try to reverse it too, but no shortcut is known: the most cost-effective approach is brute force, i.e. guessing inputs until one produces the target hash, and for a good hash function and an unpredictable input that would take trillions of years even with all the computing power on our planet combined.

Unless it's a pretty basic insecure hash function.
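To make the brute-force point concrete, here is a minimal sketch using Python's standard `hashlib`. The helper names `sha256_hex` and `brute_force`, the lowercase-only alphabet, and the length limit are assumptions chosen for illustration. It only succeeds because the search space is deliberately tiny.

```python
import hashlib
import itertools
import string


def sha256_hex(data: bytes) -> str:
    """Hash the input with the public SHA-256 algorithm."""
    return hashlib.sha256(data).hexdigest()


def brute_force(target_hex: str, max_len: int = 4):
    """Guess lowercase strings up to max_len until one hashes to target_hex.

    Illustrative helper (not from the original answer): the only known
    generic way to "reverse" the hash is to try inputs one by one.
    """
    alphabet = string.ascii_lowercase
    for length in range(1, max_len + 1):
        for candidate in itertools.product(alphabet, repeat=length):
            guess = "".join(candidate).encode()
            if sha256_hex(guess) == target_hex:
                return guess
    return None  # the input was outside the space we searched


target = sha256_hex(b"cat")
print(brute_force(target))  # recovers b'cat' after a few thousand guesses
```

Every extra character multiplies the number of guesses by the alphabet size, so the same loop becomes hopeless for long, unpredictable inputs.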


Probably not the answer you're looking for, but consider this.

Take a 10-digit number, something like 3,481,031,813, and with only pen and paper find its square (i.e. multiply it by itself). While tedious, this is relatively straightforward and can be done in a reasonable amount of time.

Now with the same pen and paper, try calculating the square root of a 20-digit number. This is a much much harder task -- even though it's effectively the reverse of the first task.

Mathematical functions can be constructed so that the inverse is much harder to compute than the function itself. One-way hashes take this to its logical conclusion -- the inverse is so hard to compute that it is practically unsolvable.

On top of that, information is lost along the way. The square of 2 is 4, but 4 has two square roots, +2 and -2: the squaring function lost the sign of the original number. Hash functions do this as well. When you take a 10 GB file and shrink it down to a 256-bit hash, enormously many different files share each hash value, so there is simply no way to reconstruct the original message from the hash alone.
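The information-loss argument can be seen directly with a toy example. The sketch below truncates SHA-256 to 16 bits (the helpers `truncated_hash` and `find_collision` are hypothetical names, and the truncation is purely for illustration) so that two different inputs with the same hash turn up quickly. With only 65,536 possible outputs, a collision is guaranteed within 65,537 distinct inputs.

```python
import hashlib


def truncated_hash(data: bytes, nbytes: int = 2) -> bytes:
    """First nbytes of the SHA-256 digest: a deliberately weak toy hash."""
    return hashlib.sha256(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Hash "0", "1", "2", ... until two inputs share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        h = truncated_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg  # two different inputs, same output
        seen[h] = msg
        counter += 1


a, b = find_collision()
print(a, b, truncated_hash(a) == truncated_hash(b))

# The full digest is always 32 bytes, whatever the input size.
print(len(hashlib.sha256(b"x" * 1_000_000).digest()))  # 32
```

Given only the 16-bit value, there is no way to tell which of the colliding inputs was hashed. The same holds for a full 256-bit hash, except that finding such pairs is no longer feasible.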


I don't think I will be able to give an answer that fully satisfies you, but the short answer is that for something to be called a "cryptographic hash function" it has to be a complex enough function that this kind of reverse engineering is not easy. That's not to say that it is impossible, but as soon as someone makes even a little bit of progress reverse-engineering a cryptographic hash function, we will consider it to be broken, and move to something stronger. You can read more about the properties of cryptographic hash functions on Wikipedia.

As an example, let's look at SHA-1. The properties of a cryptographic hash function are:

  • Pre-image resistance
  • Second pre-image resistance
  • Collision resistance

In 2005 an attack was invented that can find collisions in about 2^60 operations. Performing that attack still costs millions of US dollars, and as far as I know there are still no attacks on the other two properties (pre-image and second pre-image), but that is enough for us to consider SHA-1 completely broken.
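A rough back-of-the-envelope check shows why that counts as "broken". SHA-1 has a 160-bit output, so a generic (birthday) collision search is expected to need about 2^80 operations. The sketch below assumes an attack cost of about 2^60 operations and compares the two; the variable names are illustrative only.

```python
output_bits = 160  # SHA-1 digest size

# Generic attacks that work against any hash with this output size:
generic_collision_work = 2 ** (output_bits // 2)  # birthday bound, 2^80
generic_preimage_work = 2 ** output_bits          # exhaustive search, 2^160

# Assumed cost of the collision attack discussed above.
attack_work = 2 ** 60

speedup = generic_collision_work // attack_work
print(speedup)  # 1048576, i.e. 2^20 times cheaper than the design target
```

A hash function is considered broken as soon as an attack beats the generic bound by a meaningful margin, even if the attack is still expensive in absolute terms.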
