Is there any way to cryptographically hash a human thumbprint?

As @Xander points out, a very similar question was asked yesterday. Indeed:

  • If you can derive a key from a fingerprint, then you can hash that key and get a hash value.
  • If you can hash a fingerprint, you can use the hash value as a key.

So they really are the same question. And the answer is: people are working on it; it does not work well yet, but it might improve over the years.


I would like, though, to point out something important: a "something you know" has any value for authentication only because it is also "something that the attacker does not know". It is the secrecy which confers the power.

A fingerprint, like other biometric measures, does not really work on secrecy (although many systems try to use it that way). The important characteristic of a fingerprint is that it is attached to a human: when a human being uses his own finger on a reader, he cannot help but use his own fingerprint. Indeed, that's where the innovation is in modern fingerprint readers: in the systems which try to ensure that what they are detecting is really a human finger still attached to its nominal owner's body.

Secrecy is not a big part of fingerprints for security. Your fingerprints are not secret: you leave them everywhere, on your car, on every door handle that you go through, on the elevator buttons, on every glass that you use in a bar... If a fingerprint can be turned into a key (or hash value, regardless of how you want to see it), then that key can be rebuilt offline, from any copy of one of these prints. There is very little secret here.

To sum up: even if you could reliably turn a fingerprint into a key, it would not be a good idea to use it as a secret key. It would be useful as an indexing key, though: not for security, but for performance.


Consider that a cryptographic hash algorithm excels at producing different digest values for even the slightest differences in inputs. Even a 1 bit change in the input causes a cascade of changes yielding a completely different hash value. This avalanche effect, like pre-image resistance, is a necessary characteristic for a cryptographic hash algorithm.
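To make the point concrete, here is a minimal sketch using Python's standard hashlib. The input string and the choice of SHA-256 are just my assumptions for illustration; any cryptographic hash would behave similarly.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Stand-in for a raw sensor reading (hypothetical data).
original = b"thumbprint reading"
# Flip the lowest bit of the first byte: a 1-bit change in the input.
flipped = bytes([original[0] ^ 0x01]) + original[1:]

digest_a = hashlib.sha256(original).digest()
digest_b = hashlib.sha256(flipped).digest()

changed = bit_difference(digest_a, digest_b)
print(f"{changed} of 256 digest bits differ")
```

For a well-behaved hash you would expect roughly half of the digest bits to differ, even though the inputs differ in a single bit.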

Now look at fingerprints. One problem with fingerprints is that the relationships between identifying marks are not guaranteed to be constant between readings. Your finger might be slightly swollen due to varying levels of fluids in your body, or aligned ever so slightly differently between sensor elements, or even have a piece of dirt on it, and that could be just enough to cause an element's worth of difference between readings. Remember, even one bit of change will mean a completely different hash is output. So a precise image reading or snapshot of a print can't be directly hashed.

However, the image can be processed. Every print has a set of "landmarks", which are specifically identifiable points. Bifurcations are where two ridges join together, a rod is where a ridge terminates, an island is a short little ridge, and so on. These landmarks can be identified, and can be measured in relationship to each other. If you were to lay a thumbprint out on a grid, for example, you could identify each cell with the landmarks it contains.
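As a rough sketch of the grid idea (not a real fingerprint algorithm), suppose a hypothetical feature extractor gives you landmarks as (kind, x, y) tuples in sensor pixels. The cell size, the tuple layout, and the function names below are all my own assumptions.

```python
import hashlib
from typing import Iterable, List, Tuple

# Hypothetical landmark record: (kind, x, y), coordinates in sensor pixels.
Landmark = Tuple[str, float, float]

CELL_SIZE = 10.0  # assumed cell width in pixels

def grid_template(landmarks: Iterable[Landmark],
                  cell: float = CELL_SIZE) -> List[Tuple[int, int, str]]:
    """Map each landmark to (column, row, kind), sorted into a canonical order."""
    return sorted((int(x // cell), int(y // cell), kind)
                  for kind, x, y in landmarks)

def template_hash(landmarks: Iterable[Landmark],
                  cell: float = CELL_SIZE) -> str:
    """Hash the canonical grid description rather than the raw image."""
    encoded = repr(grid_template(landmarks, cell)).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

# Two readings of the same print with small jitter inside each cell.
reading_1 = [("bifurcation", 12.0, 37.5), ("ending", 44.2, 8.9), ("island", 25.1, 25.3)]
reading_2 = [("bifurcation", 13.1, 36.0), ("ending", 45.0, 9.4), ("island", 24.0, 26.7)]

print(grid_template(reading_1))
print(template_hash(reading_1) == template_hash(reading_2))
```

Here the jitter stays inside each cell, so both readings quantize to the same template and the same hash. That only works as long as the grid itself lands in the same place every time.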

The problem then becomes aligning the grid. If the grid isn't identically laid out each time a print is read, you would not generate the same hash.

Prints come in only three basic shapes: arches, loops, and whorls. It seems like it could be possible to use the defining characteristics of all arch-type prints (for example) to produce three needed reference points, and thus align the grid. You then process it and identify all possible landmarks. But then what? What assurance do you have that every landmark has landed in the same cell every time? If you try to establish a fuzzy zone around the gridlines, how do you know which landmarks are just barely falling into (or out of) the fuzzy zone?
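The two worries in that paragraph can be shown with a few lines of arithmetic. This is only a sketch: the cell size, the margin, and the coordinates are made-up values, and the helper names are hypothetical.

```python
CELL_SIZE = 10.0  # assumed cell width in pixels
MARGIN = 1.5      # assumed width of the fuzzy zone on each side of a gridline

def cell_of(x: float, y: float, cell: float = CELL_SIZE) -> tuple:
    """Return the (column, row) of the grid cell containing the point."""
    return (int(x // cell), int(y // cell))

def near_gridline(v: float, cell: float = CELL_SIZE, margin: float = MARGIN) -> bool:
    """True if a coordinate lies within `margin` of a gridline."""
    offset = v % cell
    return offset < margin or offset > cell - margin

def is_fuzzy(x: float, y: float) -> bool:
    """True if the point sits in the fuzzy zone of any neighbouring gridline."""
    return near_gridline(x) or near_gridline(y)

# A landmark just left of the gridline at x = 20 changes cell after a 0.8 px shift.
print(cell_of(19.6, 34.0), cell_of(20.4, 34.0))

# The fuzzy zone has the same problem at its own edge: a 0.2 px shift
# moves this landmark from "safe" to "fuzzy".
print(is_fuzzy(18.4, 34.0), is_fuzzy(18.6, 34.0))
```

The fuzzy zone just moves the hard boundary from the gridline to the edge of the zone; it does not remove it.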

(The same concern holds true if you try to use radials from the center of the reference points: how much tolerance do you build into the vectors?)

The bottom line is you will likely find it hard to get the exact same value out of the hash every single time, because the prints are never precisely lined up in a repeatable fashion.

So how could you possibly use hashes to keep the prints secure? When the user initially registers their print, you use the same grid-based scheme to analyze it, and produce a hash. You then analyze the landmarks falling in the potentially fuzzy zones, and compute a distinct hash for each possible combination of cells those landmarks could land in. You'll quickly build a large set of hashes that all represent the potential values of one user's print. Later, when a user's print is read and hashed, you look it up in the full set of hashes on file, and identify the user.
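A minimal sketch of that enrollment-and-lookup scheme might look like the following. Everything here is an assumption for illustration: the (kind, x, y) landmark format, the cell size and margin, the function names, and the sample coordinates. It also assumes the grid has already been aligned.

```python
import hashlib
from itertools import product

CELL = 10.0    # assumed cell width in pixels
MARGIN = 1.5   # assumed fuzzy-zone width on each side of a gridline

def axis_candidates(v):
    """Cell indices a coordinate could land in, given the fuzzy margin."""
    idx = int(v // CELL)
    offset = v % CELL
    candidates = [idx]
    if offset < MARGIN:
        candidates.append(idx - 1)
    elif offset > CELL - MARGIN:
        candidates.append(idx + 1)
    return candidates

def landmark_candidates(kind, x, y):
    """Every (column, row, kind) a single landmark might be read as."""
    return [(cx, cy, kind)
            for cx in axis_candidates(x)
            for cy in axis_candidates(y)]

def hash_cells(cells):
    """Hash a canonical (sorted) description of the occupied cells."""
    return hashlib.sha256(repr(sorted(cells)).encode("utf-8")).hexdigest()

def enroll(landmarks):
    """At registration: one hash per combination of possible cell assignments."""
    options = [landmark_candidates(*lm) for lm in landmarks]
    return {hash_cells(combo) for combo in product(*options)}

def read_hash(landmarks):
    """At login: a single hash of the cells actually observed."""
    return hash_cells((int(x // CELL), int(y // CELL), kind)
                      for kind, x, y in landmarks)

# Registration: two of the three landmarks sit near a gridline.
alice_print = [("bifurcation", 19.6, 34.0), ("ending", 44.2, 8.9), ("island", 25.1, 25.3)]
database = {h: "alice" for h in enroll(alice_print)}

# A later reading where the bifurcation has drifted across the gridline.
later_reading = [("bifurcation", 20.4, 34.3), ("ending", 44.0, 9.2), ("island", 25.5, 24.8)]
print(len(database), database.get(read_hash(later_reading)))
```

Each fuzzy coordinate doubles the number of stored hashes, so with k of them you store on the order of 2^k values per print. That is where the "large set of hashes" comes from, and it is also the practical limit of the approach.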