HMACSHA512 versus Rfc2898DeriveBytes for password hash

Rfc2898DeriveBytes implements PBKDF2: a function which turns a password (with a salt) into an arbitrary-length sequence of bytes. PBKDF2 is often used for password hashing (i.e. to compute and store a value which is sufficient to verify a password) because it has the needed characteristics for password hashing functions: a salt and configurable slowness.

These characteristics are needed because passwords are weak: they fit in a human brain. As such, they are vulnerable to exhaustive search: it is feasible, on a general basis, to enumerate most passwords that human users will come up with and remember. The attack assumes that the attacker got a copy of the salt and the hashed password, and will then "try passwords" on his own machine. That's called an offline dictionary attack.
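As a rough illustration of "a salt and configurable slowness", here is a minimal sketch of hashing a password with Rfc2898DeriveBytes. The class name, method name and the 20-byte output length are my own choices for the example, not something from the question; the iteration count is the knob that sets the slowness.

```csharp
using System;
using System.Security.Cryptography;

static class PasswordHashSketch
{
    // Hypothetical helper: derives a verification value from a password.
    // 'salt' must be at least 8 bytes (a .NET requirement) and unique per user.
    // 'iterations' is the slowness knob: raise it until one hash costs
    // as much CPU time as your server can tolerate.
    public static byte[] Hash(string password, byte[] salt, int iterations)
    {
        // This three-argument constructor uses HMAC-SHA-1 internally.
        using (var deriver = new Rfc2898DeriveBytes(password, salt, iterations))
        {
            // 20 bytes is the native output size of SHA-1.
            return deriver.GetBytes(20);
        }
    }
}
```

To verify a login attempt, recompute the value with the stored salt and iteration count and compare it with the stored value.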

In your case, you have a third element: a validation key. It is a key, i.e. supposedly secret. If the attacker could grab the salts and hashed passwords but not the validation key, then he cannot perform the dictionary attack on his own machines; under these conditions (the validation key remains secret, and the validation algorithm is robust -- HMAC/SHA-512 is fine for that), the configurable slowness of PBKDF2 is not needed. This kind of validation with a secret key is sometimes called "peppering".

Note, though, that when we assume that the attacker could grab a copy of the hashed passwords, then it becomes a matter of delicacy to suppose that the key remained unsullied by his vile glances. This depends on the context. Most SQL injection attacks will be able to read part or all of the database, but not the rest of the files on the machine. Nevertheless, your server must somehow be able to boot up and start without human intervention, so the validation key is somewhere on the disk. An attacker stealing the whole disk (or a backup tape...) will get the validation key as well -- at which point you are back to the need for configurable slowness.

Generally speaking, I would recommend PBKDF2 (aka Rfc2898DeriveBytes in .NET) over a custom construction, although I must say that you appear to use HMAC properly (homemade constructions rarely achieve that level of correctness). If you insist on having a "validation key" (and you are ready to assume the procedural overhead of key management, e.g. special backups for that key), then I suggest using PBKDF2 and then applying HMAC on the PBKDF2 output.
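A minimal sketch of that last suggestion (PBKDF2 first, then HMAC over the PBKDF2 output), assuming HMAC/SHA-512 as the keyed step. The class name, parameter names and the 20-byte intermediate length are hypothetical and only there for illustration:

```csharp
using System;
using System.Security.Cryptography;

static class PepperedHashSketch
{
    // Hypothetical helper combining both protections:
    //  1. PBKDF2 gives the salted, configurably slow step.
    //  2. HMAC keyed with the secret validation key gives the "pepper".
    public static byte[] Hash(string password, byte[] salt, int iterations, byte[] validationKey)
    {
        byte[] derived;
        using (var deriver = new Rfc2898DeriveBytes(password, salt, iterations))
        {
            derived = deriver.GetBytes(20); // one SHA-1 sized block
        }

        using (var hmac = new HMACSHA512(validationKey))
        {
            return hmac.ComputeHash(derived); // 64 bytes, stored next to the salt
        }
    }
}
```

If the validation key leaks, the stored values are still protected by the PBKDF2 iterations; if it stays secret, the attacker cannot run the dictionary attack offline at all.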

See this answer for a detailed discussion on password hashing.


I'm responding in specific to the EDIT of lessons learned in the original question.

Rfc2898DeriveBytes is called with 1000 iterations. using 1024 byte output. The password size in Db was designed 2k fortunately. Average sample in tests on workstations was around 300 msecs

Quick summary: If you like your current CPU load and Rfc2898DeriveBytes, then change from 1000 iterations and a 1024 byte output to 52000 iterations and a 20 byte output (20 bytes is the native output of SHA-1, which is what .NET 4.5 Rfc2898DeriveBytes is based on).

The explanation of why, including references:

To avoid giving attackers an advantage over you, do NOT use a PBKDF2/RFC2898/PKCS#5 (or an HMAC, which is used internally in PBKDF2 et al.) with an output size larger than the native output of the hash function used. Since .NET's implementation (as of even 4.5) hardcodes SHA-1, you should use a maximum of 160 bits of output, not 8192 bits of output!

The reason becomes clear from the RFC2898 spec, which describes what happens when the output size (dkLen, i.e. Derived Key Length) is greater than the native hash output size (hLen, i.e. Hash Length). On page 9, we see

Step 2: "Let l be the number of hLen-octet blocks in the derived key, rounding up"

Step 3: " T_1 = F (P, S, c, 1) , T_2 = F (P, S, c, 2) , ... T_l = F (P, S, c, l) , "

And on page 10: Step 4: "DK = T_1 || T_2 || ... || T_l<0..r-1>" where DK is the derived key (PBKDF2 output) and || is the concatenation operator.

Therefore, we can see that for your 8192 bit key and .NET's HMAC-SHA-1, we have 8192/160 = 51.2 blocks, and CEIL(51.2) = 52 blocks required for PBKDF2, i.e. T_1 through T_52 (l = 52). How the spec treats the trailing 0.2 of a block differently from a full block is outside the scope of this discussion (hint: truncation after the full result is calculated).

Thus, you're running a set of 1000 iterations on your password a total of 52 times, and concatenating the output. Thus, for one password, you're actually running 52000 iterations!

A smart attacker is going to run only 1000 iterations, and compare their 160 bit result to the first 160 bits of your 8192 bits of output - if it fails, it's a wrong guess, move on. If it succeeds, it's almost certainly a successful guess (attack).

Thus, you're running 52,000 iterations on a CPU, and an attacker is running 1,000 iterations on whatever they have (probably a few GPUs, each of which massively outpaces your CPU for SHA-1 in the first place); you've given attackers a 52:1 advantage over and above hardware advantages.
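You can check the attacker's shortcut yourself. The sketch below (class and method names are hypothetical) computes the block count l = CEIL(dkLen / hLen) and confirms that the first 20 bytes of a 1024-byte output are exactly what a 20-byte request returns for the same password, salt and iteration count, since T_1 does not depend on dkLen:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

static class BlockShortcutDemo
{
    // l = CEIL(dkLen / hLen), both lengths in bytes.
    public static int BlockCount(int dkLenBytes, int hLenBytes)
    {
        return (dkLenBytes + hLenBytes - 1) / hLenBytes;
    }

    // Returns true when the first block of the long output equals the
    // output of a request for a single block.
    public static bool FirstBlockMatches(string password, byte[] salt, int iterations)
    {
        byte[] full;
        byte[] firstBlockOnly;

        // Defender: computes all 52 blocks (T_1 .. T_52).
        using (var defender = new Rfc2898DeriveBytes(password, salt, iterations))
        {
            full = defender.GetBytes(1024);
        }

        // Attacker: computes T_1 only, at 1/52 of the cost.
        using (var attacker = new Rfc2898DeriveBytes(password, salt, iterations))
        {
            firstBlockOnly = attacker.GetBytes(20);
        }

        return full.Take(20).SequenceEqual(firstBlockOnly);
    }
}
```

BlockCount(1024, 20) gives 52, matching the calculation above.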

Thankfully, a good PBKDF2 function is easy to adjust; simply change your output length to 160 bits and your number of iterations to 52,000, and you'll use the same amount of CPU time, store smaller keys, and make it 52 times more expensive for any attackers at no runtime cost to yourself!

If you want to further nerf attackers with GPUs, you may wish to change over to PBKDF2-HMAC-SHA-512 (or scrypt or bcrypt) and an output size of 64 bytes or less (the native output size of SHA-512), which significantly reduces the advantage current (early 2014) GPUs have over CPUs, because CPUs have 64-bit instructions and GPUs do not. This is not natively available in .NET 4.5, however.

  • However, @Jither created a nice example of adding PBKDF2-HMAC-SHA256, PBKDF2-HMAC-SHA384, PBKDF2-HMAC-SHA512, etc. capabilities to .NET, and I've included a variant with a reasonable set of test vectors in my GitHub repository for reference.
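For readers on newer frameworks: as far as I know, later .NET releases (.NET Framework 4.7.2 and .NET Core 2.0 onward) added an Rfc2898DeriveBytes constructor overload that takes a HashAlgorithmName, so PBKDF2-HMAC-SHA-512 no longer needs a third-party implementation there. A minimal sketch under that assumption (the class and method names are hypothetical, and this will not compile against .NET 4.5):

```csharp
using System;
using System.Security.Cryptography;

static class Pbkdf2Sha512Sketch
{
    // Hypothetical helper: PBKDF2-HMAC-SHA-512 with exactly one block of output.
    public static byte[] Hash(string password, byte[] salt, int iterations)
    {
        using (var deriver = new Rfc2898DeriveBytes(
            password, salt, iterations, HashAlgorithmName.SHA512))
        {
            // 64 bytes is the native output size of SHA-512; asking for more
            // would reintroduce the multi-block problem described above.
            return deriver.GetBytes(64);
        }
    }
}
```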

For another reference related to an actual design flaw in 1Password, see this Hashcat forum thread - "For each iteration of PBKDF2-HMAC-SHA1 you call 4 times the SHA1 transform. But this is only to produce a 160 bit key. To produce the required 320 bit key, you call it 8 times."

P.S. if you so choose, for a small design change you can save a little more database space if you store the output in a VARBINARY or BINARY column instead of doing the Base64 encode.

P.P.S. That is, change the test code in your edit as follows (2000*52 = 104000). Note that your text said 1000 and your test code listed 2000, so I compare text to text and code to code; so mote it be.

//var hash = libCrypto.HashEncode2(password, salt, 2000);
var hash = libCrypto.HashEncode2(password, salt, 104000);
// skip down into HashEncode2
//byte[] hash = deriver.GetBytes(1024);
byte[] hash = deriver.GetBytes(20);

The SHA2 family is not a good choice for password storage. It is significantly better than MD5, but really you should be using bcrypt (or scrypt!).

RNGCryptoServiceProvider is a good source of entropy. Ideally a salt is not base 64 but base 256, i.e. each character is an entire byte. To understand this better, you need to know how rainbow tables are generated. Rainbow table generation takes a keyspace as input. For example, a rainbow table could be generated to look up alpha-numeric-symbol strings from 7 to 12 characters long. To crack this proposed salt scheme, the attacker would have to generate an alpha-numeric-symbol table for strings 71 to 76 characters long to compensate for the 64 character salt (which is large). Making each salt character a full byte would greatly increase the keyspace the rainbow table would have to exhaust.
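A minimal sketch of a full-byte salt, assuming you store the raw bytes (or encode them only for storage). It uses the RandomNumberGenerator.Create() factory from the base class that RNGCryptoServiceProvider derives from; the class and method names are hypothetical:

```csharp
using System;
using System.Security.Cryptography;

static class SaltSketch
{
    // Hypothetical helper: every salt byte can take any of the 256 values,
    // instead of being limited to a 64-character alphabet.
    public static byte[] NewSalt(int sizeInBytes)
    {
        var salt = new byte[sizeInBytes];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt); // fills the array with cryptographically strong random bytes
        }
        return salt;
    }
}
```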