How does /proc/* work?

It's actually pretty simple, at least if you don't need the implementation details.

First off, on Linux all file systems (ext2, ext3, btrfs, reiserfs, tmpfs, zfs, ...) are implemented in the kernel. Some may offload work to userland code through FUSE, and some come only in the form of a kernel module (native ZFS is a notable example of the latter due to licensing restrictions), but either way there remains a kernel component. This is an important basic.

When a program wants to read from a file, it will issue various system library calls which ultimately end up in the kernel in the form of an open(), read(), close() sequence (possibly with seek() thrown in for good measure). The kernel takes the provided path and filename, and through the file system and device I/O layer translates these to physical read requests (and in many cases also write requests -- think for example atime updates) to some underlying storage.

However, it doesn't have to translate those requests specifically to physical, persistent storage. The kernel's contract is that issuing that particular set of system calls will provide the contents of the file in question. Where exactly in our physical realm the "file" exists is secondary to this.

On /proc is usually mounted what is known as procfs. That is a special file system type, but since it is a file system, it really is no different from e.g. an ext3 file system mounted somewhere. So the request gets passed to the procfs file system driver code, which knows about all these files and directories and returns particular pieces of information from the kernel data structures.

The "storage layer" in this case is the kernel data structures, and procfs provides a clean, convenient interface to accessing those. Do keep in mind that mounting procfs at /proc is simply convention; you could just as easily mount it elsewhere. In fact, that is sometimes done, for example in chroot jails when the process running there needs access to /proc for some reason.

It works the same way if you write a value to some file; at the kernel level, that translates to a series of open(), seek(), write(), close() calls which again get passed to the file system driver; again, in this particular case, the procfs code.

The particular reason why you see file returning empty is that many of the files exposed by procfs are exposed with a size of 0 bytes. The 0 byte size is likely an optimization on the kernel side (many of the files in /proc are dynamic and can easily vary in length, possibly even from one read to the next, and calculating the length of each file on every directory read would potentially be very expensive). Going by the comments to this answer, which you can verify on your own system by running through strace or a similar tool, file first issues a stat() call to detect any special files, and then takes the opportunity to, if the file size is reported as 0, abort and report the file as being empty.

This behavior is actually documented and can be overridden by specifying -s or --special-files on the file invocation, although as stated in the manual page that may have side effects. The quote below is from the BSD file 5.11 man page, dated Oct 17 2011.

Normally, file only attempts to read and determine the type of argument files which stat(2) reports are ordinary files. This prevents problems, because reading special files may have peculiar consequences. Specifying the -s option causes file to also read argument files which are block or character special files. This is useful for determining the filesystem types of the data in raw disk partitions, which are block special files. This option also causes file to disregard the file size as reported by stat(2) since on some systems it reports a zero size for raw disk partitions.


In this directory, you can control how the kernel views devices, adjust kernel settings, add devices to the kernel and remove them again. In this directory you can directly view the memory usage and I/O statistics.

You can see which disks are mounted and what file systems are used. In short, every single aspect of your Linux system can be examined from this directory, if you know what to look for.

The /proc directory is not a normal directory. If you were to boot from a boot CD and look at that directory on your hard drive, you would see it as being empty. When you look at it under your normal running system it can be quite large. However, it doesn't seem to be using any hard disk space. This is because it is a virtual file system.

Since the /proc file system is a virtual file system and resides in memory, a new /proc file system is created every time your Linux machine reboots.

In other words, it is just a means of easily peeking and poking at the guts of the Linux system through a file and directory type interface. When you look at a file in the /proc directory, you are looking directly at a range of memory in the Linux kernel and seeing what it can see.

The layers in the file system

Enter image description here

Examples:

  • Inside /proc, there is a directory for each running process, named with its process ID. These directories contain files that have useful information about the processes, such as:
    • exe: which is a symbolic link to the file on disk the process was started from.
    • cwd: which is a symbolic link to the working directory of the process.
    • wchan: which, when read, returns the waiting channel the process is on.
    • maps: which, when read, returns the memory maps of the process.
  • /proc/uptime returns the uptime as two decimal values in seconds, separated by a space:
    • the amount of time since the kernel was started.
    • the amount of time that the kernel has been idle.
  • /proc/interrupts: For information related to interrupts.
  • /proc/modules: For a list of modules.

For more detailed information see man proc or kernel.org.


You are correct, they are not real files.

In simplest terms, it's a way to talk to the kernel using the normal methods of reading and writing files, instead of calling the kernel directly. It's in line with Unix's "everything is a file" philosophy.

The files in /proc don't physically exist anywhere, but the kernel reacts to the files you read and write within there, and instead of writing to storage, it reports information or does something.

Similarly, the files in /dev aren't really files in the traditional sense (although on some systems the files in /dev may actually exist on disk, they won't have much to them other than what device they refer to) - they enable you to talk to a device using the normal Unix file I/O API - or anything that uses it, like shells

Tags:

Linux

Proc