Why does hexdump try to read through EOF?

Thanks to @JdeBP for the hint, I was able to create a small test case that behaves the same way as hexdump:

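#include <stdio.h>

int main(void){
        char buf[64]; size_t r;
        for(;;){
                /* stream state before the read */
                printf("eof=%d, error=%d\n", feof(stdin), ferror(stdin));
                r = fread(buf, 1, sizeof buf, stdin);
                /* r is a size_t, so print it with %zu rather than %zd */
                printf("read %zu bytes, eof=%d, error=%d\n",
                        r, feof(stdin), ferror(stdin));
                /* stop once a read returns nothing */
                if(!r) return 0;
        }
}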

When run on a glibc-based system (a typical Linux desktop):

prompt$ ./fread-test
eof=0, error=0
<control-D>
read 0 bytes, eof=1, error=0

prompt$ ./fread-test
eof=0, error=0
hello
<control-D>
read 6 bytes, eof=1, error=0
eof=1, error=0
<control-D>
read 0 bytes, eof=1, error=0

When run on BSD, Solaris, BusyBox (uClibc), Android, etc.:

prompt$ ./fread-test
eof=0, error=0
hello
<control-D>
read 6 bytes, eof=1, error=0
eof=1, error=0
read 0 bytes, eof=1, error=0

Based on my inexpert interpretation of the standard, this looks like a bug in glibc (the GNU C library).

About fread:

For each object, size calls shall be made to the fgetc() function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object.

About fgetc:

If the end-of-file indicator for the input stream pointed to by stream is not set and a next byte is present, the fgetc() function shall obtain the next byte

It seems that glibc will try to "obtain the next byte" even if the eof indicator is set.

Indeed, it actually is a bug in the GNU C library, not present in the BSD or musl C libraries. It was known about in 2005. Ulrich Drepper closed the bug report without fixing the bug in 2007. It was discussed in 2012, where it was noted that other C libraries did not and do not have this behaviour, that the 1999 C Standard is quite specific about it, and that Solaris even has a special mechanism for this which is invoked when c99 is used as the compiler instead of cc.

It was finally fixed in 2018. The fix is in version 2.28 of the GNU C library. The current "stable" version of Debian, version 9, is on version 2.24 of the GNU C library, and this bug thus continues to manifest itself, 14 years after being reported.

As noted in the GNU C library discussions, there is the possibility of software that was written to rely on the quirks of the GNU C library, without regard to other C libraries such as musl or to the behaviour on other platforms. However, in the aforementioned discussions over the years no such program was identified. By contrast, several programs that are broken by the old GNU C library behaviour, in that they require users to signal EOF twice in succession, have been identified; these include, amongst others, hexdump here and patch on StackOverflow back in 2018.
