C# Create a hash for a byte array or image

There are plenty of hash providers in .NET that create cryptographic hashes, which satisfies your condition that the hashes be unique (collision-proof for most purposes). They are all extremely fast, and the hashing definitely won't be the bottleneck in your app unless you're doing it a trillion times over.

Personally I like SHA1:

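using System.Linq;
using System.Security.Cryptography;

// Extension methods must live in a static class; the name here is arbitrary.
public static class HashExtensions
{
    public static string GetHashSHA1(this byte[] data)
    {
        using (var sha1 = new SHA1CryptoServiceProvider())
        {
            return string.Concat(sha1.ComputeHash(data).Select(x => x.ToString("X2")));
        }
    }
}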

Even when people say one method might be slower than another, it's all in relative terms. A program dealing with images definitely won't notice the microsecond process of generating a hashsum.

And regarding collisions, for most purposes this is also irrelevant. Even "obsolete" methods like MD5 are still highly useful in most situations. I'd only recommend avoiding MD5 when the security of your system relies on preventing collisions.


The part of Rex M's answer about using SHA1 to generate a hash is a good one (MD5 is also a popular option). zvolkov's suggestion about not constantly creating new crypto providers is also a good one (as is the suggestion about using CRC if speed is more important than virtually guaranteed uniqueness).

However, do not use Encoding.UTF8.GetString() to convert a byte[] into a string (unless, of course, you know from context that it is valid UTF-8). For one, it does not preserve invalid sequences such as unpaired surrogates: by default they are replaced with U+FFFD, so the original bytes cannot be recovered. A method guaranteed to always give you a valid string from a byte[] is Convert.ToBase64String().
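As a rough illustration of the point about string conversion, here is a minimal sketch that encodes a SHA-1 digest with Convert.ToBase64String() and checks whether arbitrary bytes survive a round trip through Encoding.UTF8. The class and method names (HashText, Sha1Base64, SurvivesUtf8RoundTrip) are made up for this example and are not from the answers above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class HashText
{
    // SHA-1 digest of the input, encoded as Base64 (always a valid string).
    public static string Sha1Base64(byte[] data)
    {
        using (var sha1 = SHA1.Create())
        {
            return Convert.ToBase64String(sha1.ComputeHash(data));
        }
    }

    // True only if decoding the bytes as UTF-8 and re-encoding them
    // gives back exactly the same bytes.
    public static bool SurvivesUtf8RoundTrip(byte[] bytes)
    {
        byte[] back = Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(bytes));
        if (back.Length != bytes.Length) return false;
        for (int i = 0; i < bytes.Length; i++)
        {
            if (back[i] != bytes[i]) return false;
        }
        return true;
    }
}
```

With this sketch, HashText.SurvivesUtf8RoundTrip(new byte[] { 0xFF, 0xFE }) should come back false, because those bytes are not valid UTF-8, while the Base64 form of the same bytes decodes back unchanged via Convert.FromBase64String().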


Creating a new instance of SHA1CryptoServiceProvider every time you need to compute a hash is NOT fast at all. Reusing the same instance is pretty fast.
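One way to reuse a single instance might look like the sketch below. The wrapper class ReusableSha1 is a hypothetical name, and it uses SHA1.Create() rather than constructing SHA1CryptoServiceProvider directly. Note that instance members of hash algorithm classes are not guaranteed to be thread-safe, so this assumes one wrapper per thread.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical wrapper that keeps one SHA-1 instance alive across calls.
// Not thread-safe: use one instance per thread.
public sealed class ReusableSha1 : IDisposable
{
    private readonly SHA1 _sha1 = SHA1.Create();

    // Each ComputeHash call produces an independent 20-byte digest,
    // so the same instance can hash many inputs in a row.
    public byte[] Hash(byte[] data)
    {
        return _sha1.ComputeHash(data);
    }

    public void Dispose()
    {
        _sha1.Dispose();
    }
}
```

Typical use would be to create the wrapper once, call Hash() for every image in a batch, and dispose of it when the batch is done.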

Still, I'd rather use one of the many CRC algorithms instead of a cryptographic hash, since hash functions designed for cryptography don't work too well at very small hash sizes (32 bits), which is what you want for your GetHash() override (assuming that's what you're after).

Check this link out for one example of computing CRC in C#: http://sanity-free.org/134/standard_crc_16_in_csharp.html
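For a 32-bit result, a table-driven CRC-32 is another option. The sketch below is not the CRC-16 from the link; it swaps in the common CRC-32 variant (reflected polynomial 0xEDB88320, as used by zip and PNG). The class name Crc32 and the ToHashCode helper are assumptions made for this example.

```csharp
using System;

// Hypothetical CRC-32 helper (reflected polynomial 0xEDB88320).
public static class Crc32
{
    private static readonly uint[] Table = BuildTable();

    // Precompute the CRC of every possible byte value once.
    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint c = i;
            for (int k = 0; k < 8; k++)
            {
                c = (c & 1) != 0 ? 0xEDB88320u ^ (c >> 1) : c >> 1;
            }
            table[i] = c;
        }
        return table;
    }

    // CRC-32 of the whole buffer: one table lookup per input byte.
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
        {
            crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
        }
        return crc ^ 0xFFFFFFFFu;
    }

    // Reinterpret the 32-bit CRC as an int, e.g. for a GetHashCode() override.
    public static int ToHashCode(byte[] data)
    {
        return unchecked((int)Compute(data));
    }
}
```

A CRC only detects accidental differences and is easy to collide on purpose, so two buffers with equal CRCs still need a byte-by-byte comparison before being treated as identical.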

P.S. The reason you want your hash to be small (16 or 32 bits) is so you can compare hashes FAST (that was the whole point of having hashes, remember?). Representing a hash as a 256-bit value encoded as a string is pretty insane in terms of performance.
