What's the last character in a file?

ASCII control characters have definitions from the 1960s (actually preceding what you might consider a network). Not all of those control characters are used in the way that they were defined for telecommunications equipment back then.

On Unix-like systems, there is no need for an EOF character; none is used. The system can tell applications how many bytes are in a file. Two related points:

  • On some other systems (seen in VMS, DOS, Windows), a control-Z may act as an end-of-file marker because in older versions the system could not tell some applications how many bytes are in the file.

    In the case of VMS, the limitation was due to the way the C runtime worked. Assembly-language applications could (and did) get the correct file size.

  • On Unix systems, control-D typed at a terminal conventionally tells an application that the end of input has been reached, but the control-D is not stored in the file.
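To illustrate that the size comes from file metadata rather than from a marker byte, here is a minimal sketch. The helper name size_and_last_byte is made up for this example, and the file is opened in binary mode so that no runtime text-mode translation gets in the way. A control-Z (0x1A) in the middle of the data is read back like any other byte.

```python
import os
import tempfile


def size_and_last_byte(data):
    """Write data to a temp file; return (size from metadata, last byte read back).

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        size = os.stat(path).st_size  # byte count reported by the file system
        with open(path, "rb") as f:
            content = f.read()        # reads every byte, including 0x1A
        return size, content[-1:]
    finally:
        os.remove(path)


if __name__ == "__main__":
    # 0x1A is control-Z; here it is just one data byte among seven.
    print(size_and_last_byte(b"abc\x1adef"))
```

The last byte returned is simply the last byte that was written; nothing is appended to mark the end.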

In C, EOF is a negative int value (typically -1), deliberately chosen so that it cannot be mistaken for a valid character. Standard I/O functions return EOF when an end-of-file condition is detected; it is not a special character read from the file.

By the way, files need not end with a newline (ASCII line-feed) character. Text editors can cope with files which are all printable text but lack a trailing newline.
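If you want to check whether a particular file ends with a newline, looking at its final byte is enough. The following is a small sketch; ends_with_newline is a name invented for this example, and it treats an empty file as not ending with a newline.

```python
import os
import tempfile


def ends_with_newline(path):
    """Return True if the file's final byte is a line feed (0x0A)."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False          # empty file: there is no final byte
        f.seek(-1, os.SEEK_END)   # step back one byte from the end
        return f.read(1) == b"\n"


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"all printable, no trailing newline")
    os.close(fd)
    print(ends_with_newline(path))
    os.remove(path)
```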


A file does not end with an End of File character, as the previous answers correctly state. But I think the answers and comments contain some inaccuracies worth pointing out:

  • The ASCII character set does not contain an exact EOF character. There are several "end" control characters: End of Text (3), End of Transmission (4), End of Transmission Block (23), End of Medium (25). File Separator (28) maybe comes closest to an EOF character. Code 26 is "Substitute", not EOF.

  • Ctrl-D is only associated with terminal input. For example, the command cat filea fileb filec > outfile does not involve Ctrl-D. By the way, you can change the terminal EOF character to something other than Ctrl-D using the stty command.

  • Strictly speaking, Ctrl-D (or whatever you have changed it to) is not an EOF key code. What it does is make the read system call return with whatever input is available, just like pressing return makes the read system call return a line of characters to the caller. By convention, a return value of zero from the read system call (i.e. zero characters read) signals an end-of-file condition. However, the input file is not closed automatically, and, if the input comes from the terminal, it is not put in an "end of file" state. You can write a program that continues reading from the terminal even after an "end of file", and the read call can return non-zero for the next input line.

  • The analogy between the eof and eol characters can be seen if Ctrl-D is pressed when some input has already been typed on the line. For example, if you type "abc" and then press Ctrl-D, the read call returns, this time with a return value of 3 and with "abc" stored in the buffer passed as argument. Because read does not return 0, this is not interpreted as an EOF condition by the convention above. Similarly, pressing return makes the read call return with the whole input line (including the newline). You can try this out with the cat command: type some characters on the line and press Ctrl-D. You'll see the characters echoed back to you and cat waiting for more input.

  • All of the above applies only when the terminal is in "cooked" (canonical) mode, as opposed to "raw" mode, in which line input processing is minimized. In raw mode a Ctrl-D character really is delivered to the input buffer.
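The zero-length-read convention described above can be sketched without a terminal. This example uses a pipe instead, so that it runs non-interactively: closing the write end stands in for pressing Ctrl-D on an empty line. That substitution is an assumption of the sketch, not a claim that a pipe behaves like a terminal in other respects. The function name pipe_reads is invented for illustration.

```python
import os


def pipe_reads():
    """Return the results of two reads: one with data pending, one at end of input."""
    r, w = os.pipe()
    os.write(w, b"abc")
    first = os.read(r, 1024)   # 3 bytes are available, so read returns them
    os.close(w)                # no writer remains and no data is pending
    second = os.read(r, 1024)  # zero bytes read: the end-of-file convention
    os.close(r)
    return first, second


if __name__ == "__main__":
    first, second = pipe_reads()
    print(len(first), len(second))
```

In neither read does an "EOF byte" show up in the buffer. The first returns only the data, and the second returns nothing at all, which the caller interprets as end of file.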


EOF is not a character. It is a state which indicates that there are no more characters to read from a file stream. When you type the EOF key sequence at the terminal, you are signalling the end of input to the OS, not putting a special character into the stream.
