Does tail read the whole file?
No, tail doesn't read the whole file. It seeks to the end, then reads blocks backwards until it has found the expected number of lines. It then prints those lines in forward order up to the end of the file, and possibly keeps monitoring the file if the -f option is used.
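The seek-then-read-backwards approach can be sketched in Python. This is only a simplified illustration, not tail's actual implementation: the function name tail_lines, the block_size default, and treating only "\n" as a line terminator are assumptions made for the example.

```python
import os

def tail_lines(path, n=10, block_size=8192):
    """Return the last n lines of a seekable file (bytes, newlines stripped)."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        pos = f.seek(0, os.SEEK_END)  # jump to the end; returns the file size
        data = b""
        # Read blocks backwards until more than n newlines are buffered,
        # which guarantees the first line we keep is complete,
        # or until the start of the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after a trailing newline
    return lines[-n:]
```

For a large file, only the last few blocks are ever read; everything before them is skipped by the seek.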
Note, however, that tail has no choice but to read all the data when given non-seekable input, for example when reading from a pipe.
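For a pipe, the only option is to consume the stream from the start and remember the most recent lines. A minimal sketch of that idea, assuming a bounded deque as the buffer (tail_stream is a name made up for the example, and real tail implementations manage their buffers differently):

```python
from collections import deque

def tail_stream(stream, n=10):
    """Return the last n lines of a binary stream that cannot be seeked.

    Every line has to be read; the deque only keeps the newest n of them.
    """
    return list(deque(stream, maxlen=n))
```

Memory use stays bounded by n lines, but the amount of data read is still the whole input.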
Similarly, when asked to output lines counted from the beginning of the file, using the tail -n +linenumber syntax or the non-standard tail +linenumber option where supported, tail obviously reads the whole file (unless interrupted).
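Counting from the beginning can be sketched as follows. Again, this is an illustration under assumptions rather than tail's code; tail_from_line is a hypothetical name, and the stream is expected to be opened in binary mode.

```python
import itertools

def tail_from_line(stream, start):
    """Yield lines start, start+1, ... (1-based), like `tail -n +start`.

    The first start-1 lines must still be read in order to be skipped,
    and every remaining line is read in order to be output.
    """
    return itertools.islice(stream, max(start - 1, 0), None)
```

There is no way to seek directly to line N, since line boundaries are only known once the preceding bytes have been read.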
You could have checked how tail works yourself. As you can see, for one of my files read is called three times, and roughly 10K bytes are read in total:
strace 2>&1 tail ./huge-file >/dev/null | grep -e "read" -e "lseek" -e "open" -e "close"
open("./huge-file", O_RDONLY) = 3
lseek(3, 0, SEEK_CUR) = 0
lseek(3, 0, SEEK_END) = 80552644
lseek(3, 80551936, SEEK_SET) = 80551936
read(3, ""..., 708) = 708
lseek(3, 80543744, SEEK_SET) = 80543744
read(3, ""..., 8192) = 8192
read(3, ""..., 708) = 708
close(3) = 0
Since a file might be scattered on a disk I imagine it has to [read the file sequentially], but I do not understand such internals well.
As you now know,
tail just seeks to the end of the file (with the system call
lseek), and works backwards. But in the remark quoted above, you're wondering "how does tail know where on disk to find the end of the file?"
The answer is simple: tail does not know. User-level processes see files as continuous streams of bytes, so all tail can know is the offset from the start of the file. But in the filesystem, the file's inode (the on-disk record that the directory entry points to) is associated with a list of numbers denoting the physical location of the file's data blocks. When you read from the file, the kernel / the device driver figures out which part you need, works out its location on disk and fetches it for you.
That's the kind of thing we have operating systems for: so you don't have to worry about where your file's blocks are scattered.
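To make the user-level view concrete, here is a small sketch that reads the end of a file using nothing but byte offsets. The helper name read_last_bytes is invented for the example, and it assumes a regular file small enough that a single os.read returns all the requested bytes.

```python
import os

def read_last_bytes(path, count):
    """Read the final `count` bytes of a file using only byte offsets."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # offset of the end of the file
        start = max(size - count, 0)
        os.lseek(fd, start, os.SEEK_SET)     # position by offset only
        # The kernel translates this offset into disk block locations;
        # the process never sees them.
        return os.read(fd, size - start)
    finally:
        os.close(fd)
```

Nothing in this code refers to where the data lives on disk; the offset is the only address the process ever handles.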