How to determine what type of encoding/encryption has been used?

Your example string (WeJcFMQ/8+8QJ/w0hHh+0g==) is the Base64 encoding of a sequence of 16 bytes, which do not look like meaningful ASCII or UTF-8. If this is a value stored for password verification (i.e. not really an "encrypted" password, rather a "hashed" password) then this is probably the result of a hash function computed over the password; the one classical hash function with a 128-bit output is MD5. But it could be just about anything.
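To check the length claim yourself, here is a minimal Python sketch that decodes the example string and reports how many bytes it contains. The variable names are only illustrative.

```python
import base64

# The example value from the question.
stored = "WeJcFMQ/8+8QJ/w0hHh+0g=="

# validate=True rejects characters outside the Base64 alphabet.
raw = base64.b64decode(stored, validate=True)

print(len(raw), "bytes =", len(raw) * 8, "bits")
print(raw.hex())  # the same bytes shown as hexadecimal
```

A 16-byte result is consistent with a 128-bit hash such as MD5, but the length alone does not prove which function produced it.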

The "normal" way to find out is to look at the application code. Application code exists in a tangible, bulky form (executable files on a server, source code somewhere...) which is not, and cannot be, protected as well as a secret key can. So reverse engineering is the "way to go".

Barring reverse engineering, you can make a few experiments to try to make educated guesses:

  • If the same user "changes" his password but reuses the same one, does the stored value change? If yes, then part of the value is probably a randomized "salt" or IV (assuming symmetric encryption).
  • Assuming that the value is deterministic from the password for a given user, if two users choose the same password, does it result in the same stored value? If no, then the user name is probably part of the computation. You may want to try to compute MD5("username:password") or other similar variants, to see if you get a match.
  • Is the password length limited? Namely, if you set a 40-character password and cannot successfully authenticate by typing only the first 39 characters, then this means that all characters are important, and this implies that this really is password hashing, not encryption (the stored value is used to verify a password, but the password cannot be recovered from the stored value alone).
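The "try MD5("username:password") or similar variants" experiment can be sketched in Python as follows. The particular combinations, the UTF-8 encoding, and the function names `candidate_digests` and `find_matches` are assumptions made for illustration; a real application may combine the inputs in a completely different way.

```python
import base64
import hashlib

def candidate_digests(username, password):
    """Build a few guessed (hypothetical) input layouts and hash each one."""
    u, p = username.encode("utf-8"), password.encode("utf-8")
    inputs = {
        "password": p,
        "username:password": u + b":" + p,
        "username+password": u + p,
        "password+username": p + u,
    }
    digests = {}
    for label, data in inputs.items():
        for algo in ("md5", "sha1"):
            digests[f"{algo}({label})"] = hashlib.new(algo, data).digest()
    return digests

def find_matches(stored_b64, username, password):
    """Return the names of the guessed constructions that reproduce the stored value."""
    target = base64.b64decode(stored_b64)
    return [name for name, digest in candidate_digests(username, password).items()
            if digest == target]
```

For example, `find_matches(stored_value, "alice", "secret")` returns an empty list when none of the guesses fit, which only tells you that these specific variants were not used.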

Edit: I just noticed a very cool script named hashID. The name pretty much describes it.

~~~

Generally speaking, using experience to make educated guesses is how these things are done.

Here are lists with a large number of example hash outputs, so that you can see what each one looks like and build signatures/patterns, or just verify visually.

  • Online Hash Crack Hashes Generator
  • InsidePro Software Forum > Hash Types (via Archive.org)

There are two main things you first pay attention to:

  • the length of the hash (each hash function has a specific output length)
  • the alphabet used (is it all English letters? digits 0-9 and letters A-F, so hex? which special characters are there, if any?)
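These two checks are easy to automate. The following Python sketch guesses the text encoding from the alphabet and reports the decoded length in bytes. The function name `describe` and the order of the checks are assumptions; note that a hex string whose length is a multiple of 4 is also valid Base64, so hex is tested first.

```python
import re

def describe(value):
    """Return (alphabet guess, decoded length in bytes or None)."""
    # Pairs of hex digits: each pair is one byte.
    if re.fullmatch(r"(?:[0-9a-fA-F]{2})+", value):
        return "hex", len(value) // 2
    # Standard Base64 alphabet with up to two '=' padding characters.
    if re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", value) and len(value) % 4 == 0:
        padding = len(value) - len(value.rstrip("="))
        return "base64", len(value) // 4 * 3 - padding
    return "unknown", None

print(describe("WeJcFMQ/8+8QJ/w0hHh+0g=="))
print(describe("d41d8cd98f00b204e9800998ecf8427e"))
```

The resulting byte length can then be compared against the output sizes of known hash functions.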

Several password cracking programs (John the Ripper, for example) apply some pattern matching to the input to guess the algorithm used, but this only works on generic hashes. For example, if you take any hash output and rotate each letter by 1, most pattern matching schemes will fail.
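To illustrate why such a trivial transformation defeats pattern matching, here is a small Python sketch. It assumes "rotate each letter by 1" means shifting every ASCII letter forward by one position while leaving digits alone; `rotate_letters` is a hypothetical helper, not part of any cracking tool.

```python
import hashlib
import string

def rotate_letters(text):
    """Shift each ASCII letter forward by one (z wraps to a); digits are unchanged."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(lower + upper,
                          lower[1:] + lower[0] + upper[1:] + upper[0])
    return text.translate(table)

digest = hashlib.md5(b"password").hexdigest()
disguised = rotate_letters(digest)

# A simple "32 hex characters" signature no longer applies once an 'f' becomes 'g'.
looks_hex = all(c in string.hexdigits for c in disguised)
print(digest, disguised, looks_hex)
```

The length is unchanged, but the alphabet no longer matches the hex signature a pattern matcher would look for.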


What you have posted is 16 bytes (128 bits) of Base64-encoded data. The fact that it is Base64 encoded doesn't tell us much, because Base64 is not an encryption or hashing algorithm; it is a way to encode binary data as text. This means that this block gives us one useful piece of information, namely that the output is 16 bytes long. We can compare this to the output or block sizes of commonly used schemes and figure out what it can't be. By far the most common schemes are:

  • SHA-1 (160 bits)
  • MD5 (128 bits)
  • AES (128 bits)
  • DES (64 bits)
  • 3DES (64 bits)

The next thing we need to do is to look at other blocks of cipher text to figure out the answer to the following question:

  • Are all cipher texts the same length, even for different input lengths?

If not all blocks are the same length, then you aren't looking at a hashing algorithm but an encryption one. Since the output of a block cipher will always be a multiple of the underlying block size, the presence of a block whose length is not evenly divisible by 16 bytes would mean that it can't be AES and, among the schemes listed above, is therefore most likely DES or 3DES.

If you have the ability to put in a password and observe the output, this can be determined very quickly. Just put in a 17-character password and look at the length. If it's 16 bytes you probably have MD5, 20 bytes means SHA-1, 24 bytes means DES or 3DES, and 32 bytes means AES.
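The arithmetic behind the 17-character test can be sketched in Python. This assumes one byte per character, padding that always adds between 1 and one full block of bytes (PKCS#7 style), and no IV or salt stored alongside the output; `padded_length` and `guess_from_length` are illustrative names only.

```python
def padded_length(plaintext_len, block_size):
    """Ciphertext length when padding always adds 1..block_size bytes (assumed PKCS#7 style)."""
    return (plaintext_len // block_size + 1) * block_size

def guess_from_length(output_len, plaintext_len=17):
    """Map an observed output length (in bytes) to the candidate schemes listed above."""
    guesses = []
    if output_len == 16:
        guesses.append("MD5")
    if output_len == 20:
        guesses.append("SHA-1")
    if output_len == padded_length(plaintext_len, 8):   # 64-bit block
        guesses.append("DES or 3DES")
    if output_len == padded_length(plaintext_len, 16):  # 128-bit block
        guesses.append("AES")
    return guesses

for n in (16, 20, 24, 32):
    print(n, guess_from_length(n))
```

A 17-character input is convenient because it is longer than one AES block: with a shorter password, a 16-byte output would fit both MD5 and AES and the test would be ambiguous.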