Is it worth storing email addresses as hashes?

Generally speaking, you shouldn't ask for or hold user data (especially PII) that you don't need. This is even more true now under GDPR (if it applies in your scenario), but it has always been the case in security: the less data you hold, the lower the risk.

When you hash passwords, you lose knowledge of their plaintext. The fact that you're asking whether it's worth hashing emails as well as passwords makes me think that perhaps you don't really need that information in the first place.

That being said, if you do need emails (for purposes other than logging in), then you can't hash them, as you would lose that information by doing so. In that case I'd instead recommend encrypting and authenticating your data using AES with HMAC, ChaCha20-Poly1305, or similar.

Another approach would be to use a PAKE (password-authenticated key exchange): no emails, no passwords and no need to transfer them over the Internet! Examples include SRP and OPAQUE.


The first question you need to ask yourself is: does your service need the email address in the first place, and if so, what does it need it for?

If you don't need the email address, then don't store it.

If you need to know the email address, and if all of those needs can be satisfied by a hashed version, then it sounds like a good idea to store just a hash.

If you need to know the email address for purposes which cannot be satisfied by a hash, then it's not a good idea to store just a hash. If for example you need to send emails to your users, chances are you cannot do that with just a hash.

A realistic use case for hashed email addresses.

Imagine a site where users can log in using their email address and password. The user may also have a username, but that's outside the scope of this answer.

When the user logs in, they type their email address and password. To find that email address in your database, a hash value is sufficient: you store just a hash, and before each lookup you hash the value provided by the user.

If you just used a plain unsalted hash, those values could still be compared across different sites (if, for example, multiple sites using this approach had data leaks). On the other hand, hashing with a unique salt per user, as is best practice for passwords, wouldn't work either: you would have to rehash the submitted address with every user's salt, which is too inefficient for a lookup.

Instead, you can have a site-wide salt that you change infrequently (like once per year), and on each login try the lookup with every salt value that you have ever used.
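The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the salt values, the dict standing in for the database, the function names, and the normalization rule (trim and lowercase) are all hypothetical, and HMAC-SHA256 keyed with the salt is just one reasonable way to combine a site-wide salt with the address.

```python
import hashlib
import hmac

# Hypothetical site-wide salts, newest first. Old salts are kept so that
# records written before a rotation can still be found.
SITE_SALTS = [b"example-salt-current", b"example-salt-previous"]


def normalize(email: str) -> str:
    # Assumption: addresses are matched case-insensitively, ignoring
    # surrounding whitespace.
    return email.strip().lower()


def email_hash(email: str, salt: bytes) -> str:
    # Keyed hash of the normalized address; the address itself is never stored.
    return hmac.new(salt, normalize(email).encode("utf-8"), hashlib.sha256).hexdigest()


def register(db: dict, email: str, record: dict) -> None:
    # New records are always written under the current (first) salt.
    db[email_hash(email, SITE_SALTS[0])] = record


def find_user(db: dict, email: str):
    # Try every salt that has ever been used; one indexed lookup per salt.
    for salt in SITE_SALTS:
        record = db.get(email_hash(email, salt))
        if record is not None:
            return record
    return None
```

The cost of a lookup grows with the number of salts ever used, not with the number of users, which is what makes this workable where a per-user salt is not.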

That way you can look up users in your database by email without ever needing to store the email address. Passwords, of course, you still store using a password hashing function with a unique salt for every stored password.
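For contrast with the email hash, here is a minimal sketch of per-password salting using only the Python standard library. PBKDF2-HMAC-SHA256 and the iteration count are assumptions chosen for illustration; the function names are hypothetical, and a dedicated password hashing library may be preferable in practice.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your own hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    # A fresh random salt per stored password, kept alongside the digest.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    # Recompute with the stored salt and compare in constant time.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)
```

A per-user salt works here because the user record has already been located through the email hash, so only one password digest needs to be recomputed per login attempt.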

Should you want a feature to send password reset emails, that's possible as well. When the user types in their email to receive a password reset email, you can look it up in the database the same way you'd do for a login. If it matches a stored hash, you send the reset email to the address the user just typed in.

If you also want the user's email to be visible to them while they are logged in, you can store a cookie in their browser containing an encrypted version of their email address, using the hash as the key.