MD5 Hash and Base64 encoding

As per http://en.wikipedia.org/wiki/Base64

"Note that given an input of n bytes, the output will be (n + 2 - ((n + 2) % 3)) / 3 * 4 bytes long, which converges to n * 4 / 3 or 1.33333n for large n."

So, for the 32-character hex form of the hash, it will be (32 + 2 - ((32 + 2) % 3)) / 3 * 4 = (34 - (34 % 3)) / 3 * 4 = (34 - 1) / 3 * 4 = 33 / 3 * 4 = 44 characters.
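
As a quick check of that formula, here is a small Python sketch that compares it with the length the standard library's base64 module actually produces. The function name base64_length is made up for this illustration.

```python
import base64

def base64_length(n: int) -> int:
    """Padded Base64 output length for n input bytes, per the quoted formula."""
    return (n + 2 - ((n + 2) % 3)) // 3 * 4

# Compare the formula with what the encoder really produces
# for 16 bytes (raw MD5) and 32 bytes (hex MD5 string).
for n in (16, 32):
    actual = len(base64.b64encode(bytes(n)))
    print(n, base64_length(n), actual)
# prints:
# 16 24 24
# 32 44 44
```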

Alternatively, you could take the hash in its raw binary form (128 bits) and encode that directly in Base64. That means converting 16 bytes instead of 32, which gives 24 characters when Base64 encoded.
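
To make the difference concrete, this sketch encodes the same MD5 hash both ways using Python's hashlib and base64 modules. The input b"hello" is an arbitrary sample, and the variable names are only for illustration.

```python
import base64
import hashlib

data = b"hello"  # arbitrary sample input, chosen only for illustration

hex_digest = hashlib.md5(data).hexdigest()  # 32 hex characters
raw_digest = hashlib.md5(data).digest()     # 16 raw bytes

# Encoding the hex text wastes space: each hex character is a full byte.
from_hex = base64.b64encode(hex_digest.encode("ascii")).decode("ascii")
# Encoding the raw bytes is the compact form.
from_raw = base64.b64encode(raw_digest).decode("ascii")

print(len(from_hex))  # 44
print(len(from_raw))  # 24
```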


An MD5 value is always 22 (useful) characters long in Base64 notation. Many Base64 algorithms will also append 2 characters of padding when encoding an MD5 hash, bringing the total to 24 characters. The padding adds no useful information and can be discarded. Only the first 22 characters matter.

Here's why:

An MD5 hash is a 128-bit value. Every character in a Base64 string carries 6 bits of information, because there are 64 possible values for the character and 2^6 = 64. With 6 bits in every character, 21 characters hold 126 bits and 22 characters hold 132 bits. Since 128 bits cannot fit in 21 characters but do fit in 22 (with 4 bits to spare), a 128-bit value will always be represented as 22 characters in Base64.
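
The same counting argument can be written out as a few lines of Python. This is only the arithmetic from the paragraph above; the constant names are invented for the example.

```python
BITS_PER_CHAR = 6  # 2**6 == 64 possible values per Base64 character
MD5_BITS = 128

# Ceiling division: smallest character count whose capacity is >= 128 bits.
chars_needed = -(-MD5_BITS // BITS_PER_CHAR)

print(21 * BITS_PER_CHAR)  # 126, too small for 128 bits
print(22 * BITS_PER_CHAR)  # 132, enough for 128 bits
print(chars_needed)        # 22
```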

A note on the padding:

I mentioned above that many Base64 encoding algorithms add a couple of characters of padding when encoding an MD5 value. This is because Base64 represents 3 bytes of information as 4 characters. Since MD5 has 16 bytes of information, many Base64 encoding algorithms append "==" to designate that the input of 16 bytes was 2 bytes short of the next multiple of 3, which would have been 18 bytes. These 2 equal signs add no information whatsoever to the string, and can be discarded when storing. If you later decode the value with a decoder that insists on padding, append the "==" again first.
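
Here is a minimal sketch of dropping the padding for storage and putting it back before decoding, assuming Python's standard base64 module (whose decoder rejects input with missing padding). The sample input and variable names are arbitrary.

```python
import base64
import hashlib

digest = hashlib.md5(b"hello").digest()             # 16 raw bytes; sample input is arbitrary
padded = base64.b64encode(digest).decode("ascii")   # 24 characters, ends with "=="

# Drop the padding for storage: 22 characters remain.
stored = padded.rstrip("=")
print(len(padded), len(stored))  # 24 22

# Restore padding up to the next multiple of 4 characters before decoding.
restored = stored + "=" * (-len(stored) % 4)
print(base64.b64decode(restored) == digest)  # True
```

The expression -len(stored) % 4 gives the number of "=" characters needed to reach a multiple of 4, so the same line works for strings that were never stripped.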