How does Linux differentiate between real and nonexistent (e.g. device) files?

So there are basically two different types of thing here:

  1. Normal filesystems, which hold files in directories with data and metadata, in the familiar manner (including soft links, hard links, and so on). These are often, but not always, backed by a block device for persistent storage (a tmpfs lives in RAM only, but is otherwise identical to a normal filesystem). The semantics of these are familiar; read, write, rename, and so forth, all work the way you expect them to.
  2. Virtual filesystems, of various kinds. /proc and /sys are examples here, as are FUSE custom filesystems like sshfs or ifuse. There's much more diversity in these, because really they just refer to a filesystem with semantics that are in some sense 'custom'. Thus, when you read from a file under /proc, you aren't actually accessing a specific piece of data that's been stored by something else writing it earlier, as under a normal filesystem. You're essentially doing a kernel call, requesting some information that's generated on-the-fly. And this code can do anything it likes, since it's just some function somewhere implementing read semantics. Thus, you have the weird behavior of files under /proc, like for instance pretending to be symlinks when they aren't really.
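
One way to see the "generated on the fly" behaviour from user space is to compare the size a file reports with what a read actually returns. This is only a minimal Python sketch: the helper name reported_vs_read is made up for illustration, and the /proc/version path assumes a Linux system with /proc mounted.

```python
import os

def reported_vs_read(path):
    """Return (size reported by stat, number of bytes actually read)."""
    reported = os.stat(path).st_size
    with open(path, "rb") as f:
        actual = len(f.read())
    return reported, actual

if __name__ == "__main__":
    # Assumption: a Linux system; each path is skipped if it is absent.
    for p in ("/proc/version", "/etc/hostname"):
        if os.path.exists(p):
            print(p, reported_vs_read(p))
```

On a typical Linux system, most files under /proc report a size of 0 yet still return data when read, because nothing is stored until the kernel produces it for the read. For a regular file the two numbers agree.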

The key point is that /dev is usually one of the first kind. Modern distributions typically make /dev something like a tmpfs, but on older systems it was normal for it to be a plain directory on disk, without any special attributes. What matters is that the files under /dev are device nodes, a type of special file similar to FIFOs or Unix sockets. A device node has a major and minor number, and reading or writing one is a call into a kernel driver, much like reading or writing a FIFO is a call into the kernel to buffer your output in a pipe. This driver can do whatever it wants, but it usually touches hardware somehow, e.g. to access a hard disk or play sound through the speakers.

To answer the original questions:

  1. There are two questions relevant to whether the 'file exists' or not: whether the device node file literally exists, and whether the kernel code backing it is meaningful. The former is resolved just like anything on a normal filesystem. Modern systems use udev or something like it to watch for hardware events and automatically create and destroy the device nodes under /dev accordingly, but older systems, or light custom builds, can just have all their device nodes literally on the disk, created ahead of time. Meanwhile, when you open or read these files, you're calling kernel code that is selected by the major and minor device numbers; if these don't correspond to anything real (for instance, you're trying to read a block device that doesn't exist), you'll just get an error, typically ENXIO ("No such device or address") or some kind of I/O error.

  2. How the kernel works out what code to call varies. Virtual filesystems like /proc implement their own read and write functions; the kernel just calls that code depending on which mount point the file is under, and the filesystem implementation takes care of the rest. For device files, the call is dispatched based on the node's type (block or character) and its major and minor device numbers.
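
The two separate "does it exist" questions can be told apart from user space by looking at how an open attempt fails. The following is a rough Python sketch, not a definitive test: probe_device and its return strings are invented for illustration, and the assumption that a node with no driver behind it fails with ENXIO or ENODEV reflects typical Linux behaviour rather than a guarantee.

```python
import errno
import os

def probe_device(path):
    """Classify what happens when we try to open path read-only."""
    try:
        # O_NONBLOCK avoids hanging on FIFOs or terminals while probing.
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except FileNotFoundError:
        return "no such node"              # the directory entry is missing
    except PermissionError:
        return "node exists, access denied"
    except OSError as e:
        # Assumption: these errno values mean "no driver for these numbers".
        if e.errno in (errno.ENXIO, errno.ENODEV):
            return "node exists, no driver behind it"
        return "open failed: " + os.strerror(e.errno)
    os.close(fd)
    return "node exists and opened"

if __name__ == "__main__":
    for p in ("/dev/null", "/dev/does-not-exist"):
        print(p, "->", probe_device(p))
```

The first failure is an ordinary filesystem lookup miss; the ENXIO/ENODEV case is the kernel finding the node but nothing registered for its numbers.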


Here's a file listing of /dev/sda1 on my nearly up-to-date Arch Linux server:

% ls -li /dev/sda1
1294 brw-rw---- 1 root disk 8, 1 Nov  9 13:26 /dev/sda1

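So the directory entry in /dev/ for sda1 has an inode number, 1294. It's a real file with a real inode, although on a modern system /dev is usually a RAM-backed filesystem (devtmpfs or tmpfs) rather than a directory on disk.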

Look at where the file size usually appears: "8, 1" appears instead. These are the major and minor device numbers. Also note the 'b' at the start of the permissions string, which marks a block device.
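
The same numbers that ls prints can be read programmatically from the st_rdev field that stat returns. A small Python sketch, where device_numbers is a name made up for this example:

```python
import os
import stat

def device_numbers(path):
    """Return (kind, major, minor) for a device node, or None otherwise."""
    st = os.stat(path)
    if stat.S_ISBLK(st.st_mode):
        kind = "block"
    elif stat.S_ISCHR(st.st_mode):
        kind = "char"
    else:
        return None  # regular files, directories, etc. carry no device numbers
    # st_rdev packs both numbers; os.major/os.minor unpack them.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)

if __name__ == "__main__":
    # Assumption: these paths may not exist on every machine, so check first.
    for p in ("/dev/sda1", "/dev/null", "/etc"):
        if os.path.exists(p):
            print(p, device_numbers(p))
```

On a machine with the listing above, /dev/sda1 would be expected to come back as a block device with major 8 and minor 1.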

The file /usr/include/ext2fs/ext2_fs.h contains this C struct (only a fragment is shown):

/*
 * Structure of an inode on the disk
 */
struct ext2_inode {
    __u16   i_mode;     /* File mode */

That struct shows us the on-disk structure of a file's inode. Lots of interesting stuff is in that struct; take a long look at it.

The i_mode element of struct ext2_inode has 16 bits. It uses 9 of them for the user/group/other read/write/execute permissions and another 3 for setuid, setgid, and sticky. The remaining 4 bits differentiate among types like "plain file", "symbolic link", "directory", "named pipe", "Unix family socket", "character device", and "block device".
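
The 4 + 3 + 9 split can be tried out with plain bit masks. This Python sketch uses the standard stat module constants; split_mode and the example mode word are made up for illustration and are not read from any real inode.

```python
import stat

def split_mode(mode):
    """Split a 16-bit mode word into (type bits, special bits, permission bits)."""
    file_type = mode & 0o170000   # top 4 bits: file type
    special = mode & 0o007000     # setuid, setgid, sticky
    perms = mode & 0o000777       # rwx for user, group, other
    return file_type, special, perms

# Hypothetical mode word: a block device with permissions rw-rw----
example = stat.S_IFBLK | 0o660

file_type, special, perms = split_mode(example)
print(oct(file_type), oct(special), oct(perms))
print(file_type == stat.S_IFBLK)   # type bits match the block-device constant
print(stat.filemode(example))      # same string format that ls prints
```

For this example mode, stat.filemode gives "brw-rw----", the same string ls showed for /dev/sda1.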

The Linux kernel can follow the usual directory lookup algorithm, then make a decision based on the type bits in the i_mode element. For 'b', block device files, it can find the major and minor device numbers and, traditionally, use the major device number to look up a pointer to some kernel function (a device driver) that deals with disks. The minor device number usually gets used as, say, the SCSI bus device number, or the EIDE device number, or something like that.
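
The "major number selects the driver, minor number is passed to it" idea can be modelled as a lookup table. This is a toy Python model only: every name in it is invented, and it does not mirror the kernel's real data structures or registration API.

```python
# Toy model: maps a major number to a driver function.
block_drivers = {}

def register_block_driver(major, driver):
    """Record which function handles a given major number."""
    block_drivers[major] = driver

def open_block_device(major, minor):
    """Look up the driver by major number and hand it the minor number."""
    driver = block_drivers.get(major)
    if driver is None:
        raise LookupError(f"no driver registered for major {major}")
    return driver(minor)

def toy_disk_driver(minor):
    # The driver decides what the minor number means; this toy just reports it.
    return f"toy disk driver handling minor {minor}"

register_block_driver(8, toy_disk_driver)
print(open_block_device(8, 1))
```

An unregistered major number raises LookupError here, which loosely corresponds to the error you get when a device node exists but no driver backs it.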

Some other decisions about how to deal with a file like /proc/cpuinfo are made based on the filesystem type. If you do a:

% mount | grep proc
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)

you can see that /proc has a file system type of "proc". Reading from a file in /proc causes the kernel to do something different based on the type of the file system, just as opening a file on a ReiserFS or DOS file system would cause the kernel to use different functions to locate files and to locate their data.
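
The same lookup can be done for any path by finding the longest mount point that contains it. A hedged Python sketch: fs_type_for is a made-up helper, it assumes the "device mountpoint fstype options ..." line layout used by /proc/self/mounts, and it ignores escaped characters in mount point names.

```python
import os

def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point that contains path."""
    path = os.path.abspath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # ">=" lets a later mount on the same point override an earlier one.
        if inside and len(mount_point) >= len(best_point):
            best_point, best_type = mount_point, fs_type
    return best_type

if __name__ == "__main__":
    # Assumption: Linux with /proc mounted, which exposes the mount table here.
    if os.path.exists("/proc/self/mounts"):
        with open("/proc/self/mounts") as f:
            table = f.read()
        for p in ("/proc/cpuinfo", "/dev/null", "/etc/passwd"):
            print(p, "->", fs_type_for(p, table))
```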


At the end of the day they are all files to Unix; that's the beauty of the abstraction.

The way the files are handled by the kernel, now that is a different story.

/proc and, nowadays, /dev and /run (aka /var/run) are virtual filesystems in RAM. /proc is an interface/window onto kernel variables and structures.

I recommend reading The Linux Kernel http://tldp.org/LDP/tlk/tlk.html and Linux Device Drivers, Third Edition https://lwn.net/Kernel/LDD3/.

I also enjoyed The Design and Implementation of the FreeBSD Operating System http://www.amazon.com/Design-Implementation-FreeBSD-Operating-System/dp/0321968972/ref=sr_1_1

Have a look at the page that is relevant to your question:

http://www.tldp.org/LDP/tlk/dd/drivers.html