How does Linux work with symbolic links?

A symlink stores the name (path) of the real file, not its inode. When the system resolves the symlink to open the actual file, it looks up that name and from then on uses the file's inode. At that point, the path you used to get to the file no longer matters: whatever the OS has not cached, it reads from the file by its inode. As I understand it, you could start reading a file through a hard link and then remove that hard link, and it would cause no problems as long as the name had already been resolved to an inode (name string -> inode).

A symbolic link is a small file that contains the location (i.e. path and filename) of a target file. Its file type, recorded in the inode (and on some filesystems also in the directory entry), marks it as a symlink.

When you open a symlink, the OS follows the stored location to find the target file. If the target is itself a symlink, the OS follows that location as well (1)(2), until the location points to a file that is not a symlink (let's call it the FinalFile). The OS then obtains the inode of the FinalFile (the inode contains metadata such as the modification time, and also points to the file's data). Finally, the inode of the FinalFile is opened. From then on the process uses that inode to read from and write to the file. As a result, renaming or moving the symlink, deleting the symlink, renaming or moving the FinalFile, or even deleting the FinalFile (3) has no effect on the process; it is still working with the same inode.
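The behaviour described above can be sketched in a few lines of shell. This is only an illustration under assumptions: the names final.txt and link.txt and the use of file descriptor 3 are made up for the example, and it runs in a throwaway directory from mktemp.

```shell
# Sketch: an open file descriptor outlives both the symlink and the target's name.
dir=$(mktemp -d)
cd "$dir" || exit 1

echo 'data' > final.txt        # the FinalFile
ln -s final.txt link.txt       # symlink whose content is the string "final.txt"

exec 3< link.txt               # open through the symlink; fd 3 now refers to the inode
rm link.txt final.txt          # remove the symlink and the only name of the FinalFile

content=$(cat <&3)             # still readable: the inode stays alive while fd 3 is open
exec 3<&-                      # closing the last descriptor lets the kernel free the inode
echo "$content"
```

This should print data even though neither name exists any more by the time the file is read.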

In most cases file-data operations on the symlink will affect the FinalFile (e.g. reading and writing to the symlink will read from/write to the FinalFile) but there are exceptions: the readlink() system call reads the contents of the symlink itself.

File-metadata operations (like rename or delete), on the other hand, usually affect the symlink itself. There are exceptions here as well: the stat() system call follows the symlink and returns information on the FinalFile, whereas lstat() is like stat() except that it returns information on the symlink itself (2).
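A rough shell illustration of the split between data and metadata operations follows. The file names are hypothetical, and it assumes the readlink command-line tool is installed (it ships with GNU coreutils and the BSDs). The shell tests -L and -f stand in for lstat() and stat() here: -L examines the link itself, while -f follows it.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

echo 'data' > final.txt
ln -s final.txt link.txt

via_link=$(cat link.txt)        # data operation: follows the symlink and reads final.txt
target=$(readlink link.txt)     # reads the symlink's own content, i.e. the stored path

[ -L link.txt ] && is_link=yes  # -L inspects the link itself (lstat-like)
[ -f link.txt ] && is_file=yes  # -f follows the link (stat-like) and sees a regular file

mv link.txt renamed.txt         # rename acts on the symlink; final.txt is untouched
rm renamed.txt                  # delete acts on the symlink as well
[ -f final.txt ] && survived=yes

echo "$via_link $target $is_link $is_file $survived"
```

The last line should print: data final.txt yes yes yes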

(1) There's a limit on the number of levels and things get a bit more complex if the location in the symlink is a relative path.

(2) For more details, read symlink(7), "symbolic link handling": man 7 symlink

(3) The rm command and the unlink() system call do not physically remove a file. They remove the directory entry that points to the file's inode. The file itself is removed only when both a) there are no more directory entries (hard links) that refer to its inode and b) no process has the file open.
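Footnote (1) can be made concrete with a small shell sketch. The directory and file names are invented for the example. It shows that a relative location is resolved from the directory that holds the symlink, and that a chain of symlinks that never reaches a real file makes the open fail rather than loop forever.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

mkdir sub
echo 'data' > sub/final.txt

# The stored path "final.txt" is relative, so it is resolved from sub/,
# not from the current directory.
ln -s final.txt sub/link.txt
relative=$(cat sub/link.txt)

# Moving the link changes where the same stored path resolves: now it dangles.
mv sub/link.txt link.txt
if cat link.txt 2>/dev/null; then moved=resolved; else moved=dangling; fi

# Two symlinks pointing at each other: resolution gives up with an error.
ln -s loop_b loop_a
ln -s loop_a loop_b
if cat loop_a 2>/dev/null; then loop=resolved; else loop=failed; fi

echo "$relative $moved $loop"
```

The last line should print: data dangling failed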

Symlinks are almost transparent to Linux itself; they have much more to do with the filesystem you are using than with the operating system.

A symlink is not a regular file, not even a very small one: it is recorded directly by the filesystem. That is why you cannot, for example, create a working symbolic link on a VFAT partition just by copying the link itself to it.

The difference between a symbolic link and a hard link is that a symbolic link points to a hard link (a name), whereas a hard link points, via the inode, to the data sectors.

Example:

Test 1:

echo 'data' >file.txt

This creates the hard link file.txt, pointing to sectors 10 to 20 (the numbers are only for illustration).

Test 2:

Now, what if we run this?

ln file.txt file_2.txt

This creates the hard link file_2.txt, pointing to the same sectors 10 to 20 as file.txt. If you delete file.txt, sectors 10 to 20 stay reserved and you can still see the data through file_2.txt. (file.txt and file_2.txt are equals; neither is more "the original" than the other.)

Test 3:

ln -s file.txt file_sym.txt

This points the symbolic link file_sym.txt at the hard link file.txt, so when you access file_sym.txt you see file.txt. But if you delete file.txt, file_sym.txt can no longer find its target.
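The three tests can be combined into one script that also checks the outcome. This is a sketch only: it runs in a temporary directory, and it assumes a shell whose test command supports -ef (bash and dash do), which is true when two names refer to the same inode.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

echo 'data' > file.txt            # Test 1: one name for a new inode
ln file.txt file_2.txt            # Test 2: a second name for the same inode
ln -s file.txt file_sym.txt       # Test 3: a symlink storing the path "file.txt"

[ file.txt -ef file_2.txt ] && same_inode=yes   # both names refer to one inode

rm file.txt                       # removes one name; the inode still has file_2.txt
hard=$(cat file_2.txt)            # the data is still reachable through the other hard link

# The symlink stored only the name, which no longer exists, so it dangles.
if [ -L file_sym.txt ] && [ ! -e file_sym.txt ]; then sym=dangling; else sym=ok; fi

echo "$same_inode $hard $sym"
```

The last line should print: yes data dangling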

Links are managed by the filesystem driver, for example the ext4 module on Linux (or the built-in driver, if ext4 is compiled into the kernel); it does not matter whether you are using Linux or another Unix.
