When returning the difference between pointers of char strings, how important is the order of casting and dereferencing?

On a two's complement system (which is pretty much all of them), it won't make a difference.

The first example, *(unsigned char *)x, simply interprets the bits stored at the location as an unsigned char. If the value stored there is -1, the stored byte (assuming CHAR_BIT == 8) is 0xFF, which is then read as 255 because that is what the bit pattern means for an unsigned type.

The second example, (unsigned char)*x (assuming char is signed on this compiler), first reads the value stored at the location and then converts it to unsigned char. So we read -1, and the standard states that a negative value is converted to an unsigned type by adding one more than the maximum value storable in that type, as many times as necessary, until the result is within range. So you get -1 + 256 = 255.
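As a minimal sketch of the two expressions side by side (the helper names via_pointer_cast and via_value_cast are made up for illustration, and a two's complement host with CHAR_BIT == 8 is assumed):

```c
/* First form: reinterpret the stored byte as unsigned char. */
int via_pointer_cast(const char *x) {
    return *(const unsigned char *)x;
}

/* Second form: read the char value, then convert it to unsigned char. */
int via_value_cast(const char *x) {
    return (unsigned char)*x;
}
```

Under those assumptions both helpers return 255 for a byte holding 0xFF, and they agree for ordinary characters such as 'A'.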

However, if you somehow were on a one's complement system, things go a bit differently.

Again, using *(unsigned char *)x, we reinterpret the hex representation of -1 as an unsigned char, but this time the hex representation is 0xFE, which will be interpreted as 254 rather than 255.

Going back to (unsigned char)*x, it will still just perform -1 + 256 to get the end result of 255.
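One's complement hardware is hard to come by, so the difference can only be simulated. The sketch below assumes CHAR_BIT == 8 and uses made-up helper names; it builds the byte that a one's complement machine would store for a small negative number and, separately, performs the value conversion.

```c
/* Byte a one's complement machine would store for -magnitude:
   the bitwise inverse of +magnitude, kept to 8 bits (simulation only). */
unsigned char ones_complement_byte(unsigned magnitude) {
    return (unsigned char)(~magnitude & 0xFFu);
}

/* Result of (unsigned char)(-magnitude): the conversion is defined
   arithmetically, so it does not depend on the representation. */
unsigned char converted_value(int magnitude) {
    return (unsigned char)(-magnitude);
}
```

For a magnitude of 1 the first helper gives 0xFE (254) and the second gives 0xFF (255), matching the numbers above.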

All that said, I'm not sure whether the C standard allows a character encoding to use the 8th bit of a char. I know it's not used in ASCII-encoded strings, which again is what you will most likely be working with, so you likely won't come across any negative values when comparing actual strings.


Converting from signed to unsigned can be found in the C11 standard at section 6.3.1.3:

  1. When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

  2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
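The rule in paragraph 2 can be applied literally and checked against a plain cast. This is only a sketch: convert_by_rule is a hypothetical helper, and it assumes UCHAR_MAX + 1 fits in long long.

```c
#include <limits.h>

/* Apply 6.3.1.3p2 by hand: add or subtract (UCHAR_MAX + 1)
   until the value lies in the range of unsigned char. */
int convert_by_rule(int value) {
    const long long modulus = (long long)UCHAR_MAX + 1;
    long long v = value;
    while (v < 0) {
        v += modulus;
    }
    while (v > UCHAR_MAX) {
        v -= modulus;
    }
    return (int)v;
}
```

The result should equal (unsigned char)value for any int; for example, convert_by_rule(-1) is UCHAR_MAX.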


And why choose one over the other?

The two forms below differ on non-2's complement systems in an interesting way.

// #1
return (*(unsigned char *)s1 - *(unsigned char *)s2);
// #2
return ((unsigned char)*s1 - (unsigned char)*s2);

Integer non-2's complement encodings (all but extinct these days) had a bit pattern that was either -0 or a trap representation.

If code used (unsigned char)*s1 when s1 pointed to such a bit pattern, either the -0 would become a sign-less 0 or a trap could happen.

With -0 becoming an unsigned char, it would lose its arithmetic distinction from a null character, the character at the end of a string.
In C, a null character is a "byte with all bits set to 0".

To prevent that, *(unsigned char *)s1 is used.

C requires it:

7.24.1 String function conventions
For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value). C17dr § 7.24.1.3
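A small sketch of why that interpretation matters for ordering, even on ordinary two's complement machines. The helper names are hypothetical, and the result of diff_plain for bytes above 0x7F depends on whether char is signed on the implementation.

```c
/* Difference of the first characters read as plain char;
   the sign for bytes above 0x7F is implementation-dependent. */
int diff_plain(const char *a, const char *b) {
    return *a - *b;
}

/* Difference of the first characters read as unsigned char,
   which is how the string functions are required to see them. */
int diff_unsigned(const char *a, const char *b) {
    return *(const unsigned char *)a - *(const unsigned char *)b;
}
```

With "\x80" and "a", diff_unsigned is positive, while diff_plain is negative where char is signed, so the two would order the strings differently.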

To that end, OP's code has a bug. With non-2's complement, a -0 in *s1 should not stop the loop.

// while (*s1 == *s2 && *s1 && n > 1)
while (*(unsigned char *)s1 == *(unsigned char *)s2 && *(unsigned char *)s1 && n > 1)

For the pedantic, a char may be the same size as an int. Some graphics processors have done this. In such cases, to prevent overflow, the following can be used. Works for the usual 8-bit char too.

// return (*(unsigned char *)s1 - *(unsigned char *)s2);
return (*(unsigned char *)s1 > *(unsigned char *)s2) - 
       (*(unsigned char *)s1 < *(unsigned char *)s2);
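The same idea pulled out into a helper, as a sketch (cmp_uchar is a made-up name). Each comparison yields 0 or 1, so the result is always -1, 0 or 1 and no subtraction of character values takes place.

```c
/* Three-way compare of two characters: returns -1, 0 or 1.
   Only the 0/1 comparison results are subtracted, so nothing can overflow. */
int cmp_uchar(unsigned char a, unsigned char b) {
    return (a > b) - (a < b);
}
```

Callers that only test the sign of the result, as users of strcmp() and strncmp() are expected to, see no difference from the subtraction form.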

Alternative

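#include <stddef.h>  // size_t

int strncmp(const char *s1, const char *s2, size_t n) {
  // Read every character as unsigned char, as the string functions require
  const unsigned char *u1 = (const unsigned char *) s1;
  const unsigned char *u2 = (const unsigned char *) s2;
  if (n == 0) {
    return 0;
  }
  // Advance while the characters match, are not the null character,
  // and more than one character remains to be compared
  while (*u1 == *u2 && *u1 && n > 1) {
    n--;
    u1++;
    u2++;
  }
  // Three-way result without subtraction, so it cannot overflow
  return (*u1 > *u2) - (*u1 < *u2);
}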