How to use SHA1 hashing in C programming

If you have all of your data at once, just use the SHA1 function:

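#include <string.h>
#include <openssl/sha.h>

// The data to be hashed
char data[] = "Hello, world!";
size_t length = strlen(data);

unsigned char hash[SHA_DIGEST_LENGTH];
// SHA1() takes a const unsigned char *, so cast the char array
SHA1((const unsigned char *)data, length, hash);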
// hash now contains the 20-byte SHA-1 hash

If, on the other hand, you receive your data one piece at a time and want to compute the hash as it arrives, use SHA1_Init, SHA1_Update, and SHA1_Final:

// Error checking omitted for expository purposes

// Object to hold the current state of the hash
SHA_CTX ctx;
SHA1_Init(&ctx);

// Hash each piece of data as it comes in:
SHA1_Update(&ctx, "Hello, ", 7);
...
SHA1_Update(&ctx, "world!", 6);
// etc.
...
// When you're done with the data, finalize it:
unsigned char hash[SHA_DIGEST_LENGTH];
SHA1_Final(hash, &ctx);

I believe I should be using either unsigned char *SHA1 or SHA1_Init ...

For later versions of the OpenSSL library, like 1.0.2 and 1.1.0, the project recommends using the EVP interface. An example of using EVP Message Digests with SHA256 is available on the OpenSSL wiki:

#define handleErrors abort

EVP_MD_CTX *ctx;

if((ctx = EVP_MD_CTX_create()) == NULL)
    handleErrors();

if(1 != EVP_DigestInit_ex(ctx, EVP_sha256(), NULL))
    handleErrors();

unsigned char message[] = "abcd .... wxyz";
unsigned int message_len = sizeof(message) - 1; // exclude the terminating NUL

if(1 != EVP_DigestUpdate(ctx, message, message_len))
    handleErrors();

unsigned char digest[EVP_MAX_MD_SIZE];
unsigned int digest_len = sizeof(digest);

if(1 != EVP_DigestFinal_ex(ctx, digest, &digest_len))
    handleErrors();

EVP_MD_CTX_destroy(ctx);

They're two different ways to achieve the same thing.

Specifically, you either call SHA1_Init, then SHA1_Update as many times as necessary to pass your data through, and then SHA1_Final to get the digest, or you call SHA1 once on the whole input.

The reason for two modes is that when hashing large files it is common to read the file in chunks, as the alternative would use a lot of memory. Keeping track of the SHA_CTX (the SHA context) as you go lets you do this. The algorithm internally also fits this model: data is processed one block at a time.
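To make the chunked-read pattern concrete, here is a minimal sketch in plain C. The names feed_file, chunk_fn, and count_bytes are hypothetical and not part of OpenSSL; the hash update is abstracted as a callback so that the sketch compiles without the library, and count_bytes is only a stand-in that counts bytes instead of hashing them.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical callback type: receives one chunk of data.
   Returns 1 on success, mirroring the OpenSSL return convention. */
typedef int (*chunk_fn)(void *state, const void *data, size_t len);

/* Reads fp to the end in fixed-size chunks and passes each chunk to update.
   Returns 1 on success, 0 on a read error or if update reports failure. */
static int feed_file(FILE *fp, chunk_fn update, void *state)
{
    unsigned char buf[4096];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        if (update(state, buf, n) != 1)
            return 0;
    }
    return ferror(fp) ? 0 : 1;
}

/* Stand-in for a hash update: only counts the bytes it was given. */
static int count_bytes(void *state, const void *data, size_t len)
{
    (void)data;
    *(size_t *)state += len;
    return 1;
}
```

With OpenSSL, one way to use this would be a small wrapper that casts state to SHA_CTX * and forwards the chunk to SHA1_Update, with SHA1_Init called before feed_file and SHA1_Final after it. Only one 4096-byte buffer is held in memory regardless of the file size.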

The one-shot SHA1() call should be fairly straightforward. The incremental interface works like this:

unsigned char md[SHA_DIGEST_LENGTH];
SHA_CTX context;
SHA1_Init(&context);

// Feed each block to the hash; each call returns 1 on success
for (size_t i = 0; i < numblocks; i++)
{
    SHA1_Update(&context, pointer_to_data, data_length);
}
SHA1_Final(md, &context);

Crucially, at the end md will contain the binary digest, not a hexadecimal representation - it's not a string and shouldn't be used as one.
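If you do need a printable form, the binary digest has to be converted explicitly. A minimal sketch, assuming a hypothetical helper named digest_to_hex (not an OpenSSL function) and a caller-supplied output buffer:

```c
#include <stddef.h>

/* Writes 2*len lowercase hex characters plus a terminating NUL into out.
   out must have room for at least 2*len + 1 bytes. */
static void digest_to_hex(const unsigned char *md, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";

    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[md[i] >> 4];    /* high nibble */
        out[2 * i + 1] = digits[md[i] & 0x0f];  /* low nibble */
    }
    out[2 * len] = '\0';
}
```

For a SHA-1 digest, the output buffer would be declared as char hex[2 * SHA_DIGEST_LENGTH + 1] and filled with digest_to_hex(md, SHA_DIGEST_LENGTH, hex).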


The first function, SHA1(), is the higher-level one and is probably the one you want. The doc is pretty clear on the usage: d is the input, n is its size, and md is where the result is placed (you allocate it).

As for the other 3 functions - these are lower level and I'm pretty sure they are internally used by the first one. They are better suited for larger inputs that need to be processed in a block-by-block manner.

Tags: C, Hash