What is the NSFS filesystem?

As described in the kernel commit log linked to by jiliagre above, the nsfs filesystem is a virtual filesystem making Linux-kernel namespaces available. It is separate from the /proc "proc" filesystem, where some process directory entries reference inodes in the nsfs filesystem in order to show which namespaces a certain process (or thread) is currently using.
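To see these references for yourself, here is a minimal sketch. It assumes GNU coreutils `stat`; `ns_info` is just a helper name made up for this illustration. It prints the link target, the inode behind it, and the magic number of the filesystem that inode lives on.

```shell
# Print the namespace link target, inode and filesystem magic for one
# namespace type of the current process (default: net).
# ns_info is a hypothetical helper name, not part of any tool.
ns_info() {
    ns_path=/proc/self/ns/${1:-net}
    # The symlink target has the form "<type>:[<inode>]".
    echo "link:   $(readlink "$ns_path")"
    # -L follows the link, so this is the inode of the nsfs object itself.
    echo "inode:  $(stat -L -c %i "$ns_path")"
    # -f reports on the filesystem holding the file; %t is its type in hex.
    # For nsfs this should be NSFS_MAGIC, 6e736673 (ASCII "nsfs").
    echo "fstype: 0x$(stat -f -c %t "$ns_path")"
}

ns_info net
```

The inode number printed by `stat` should match the number inside the brackets of the link target.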

The nsfs filesystem is not listed in /proc/filesystems (whereas proc is), so it cannot be mounted explicitly: mount -t nsfs ./namespaces fails with "unknown filesystem type". This is because nsfs is tightly interwoven with the proc filesystem.
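A quick way to check the listing on your own system; this is only a sketch, and `fs_listed` is a hypothetical helper name:

```shell
# Succeed if the named filesystem type appears in /proc/filesystems.
# Lines look like "nodev<TAB>proc" or "<TAB>ext4"; the name is the last field.
# fs_listed is a hypothetical helper name.
fs_listed() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' /proc/filesystems
}

for fs in proc nsfs; do
    if fs_listed "$fs"; then
        echo "$fs: listed"
    else
        echo "$fs: not listed"
    fi
done
```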

The filesystem type nsfs only becomes visible via /proc/$PID/mountinfo when bind-mounting an existing(!) namespace filesystem link to another target. As Stephen Kitt rightly suggests above, this is to keep namespaces existing even if no process is using them anymore.
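If you want to look at mountinfo directly rather than through findmnt, something along these lines should work. It is a sketch that assumes the documented mountinfo field layout; `nsfs_mounts` is a hypothetical helper name.

```shell
# Print "<mount point> <namespace>" for every nsfs mount found in
# mountinfo-formatted input (default: this process's own mountinfo).
# nsfs_mounts is a hypothetical helper name.
nsfs_mounts() {
    awk '{
        # Optional fields end at a lone "-"; the filesystem type follows it.
        for (i = 7; i <= NF; i++) if ($i == "-") break
        # Field 4 is the root within the filesystem, e.g. net:[<inode>];
        # field 5 is the mount point.
        if ($(i + 1) == "nsfs") print $5, $4
    }' "${1:-/proc/self/mountinfo}"
}

nsfs_mounts
```

Pass /proc/$PID/mountinfo as the first argument to inspect another process's mount namespace.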

For example, create a new network namespace, bind-mount it, then exit. The namespace still exists, but lsns won't find it: it is no longer listed in any /proc/$PID/ns, and exists only as a (bind) mount point.

# bind mount only needs an inode, not necessarily a directory ;)
touch mynetns
# create new network namespace, show its id and then bind-mount it, so it
# is kept existing after the unshare'd bash has terminated.
# output: net:[##########]
NS=$(sudo unshare -n bash -c "readlink /proc/self/ns/net && mount --bind /proc/self/ns/net mynetns") && echo "$NS"
# notice how lsns cannot see this namespace anymore: no match!
# ${NS:5:-1} strips the leading "net:[" and the trailing "]" (needs bash >= 4.2)
lsns -t net | grep "${NS:5:-1}" || echo "lsns: no match for net:[${NS:5:-1}]"
# however, findmnt does locate it on the nsfs...
findmnt -t nsfs | grep "${NS:5:-1}" || echo "no match for net:[${NS:5:-1}]"
# output: /home/.../mynetns nsfs[net:[##########]] nsfs rw
# let the namespace go...
echo "unbinding + releasing network namespace"
sudo umount mynetns
findmnt -t nsfs | grep "${NS:5:-1}" || echo "findmnt: no match for net:[${NS:5:-1}]"
# clean up
rm mynetns

The output should look similar to this:

net:[4026532992]
lsns: no match for net:[4026532992]
/home/.../mynetns nsfs[net:[4026532992]] nsfs   rw
unbinding + releasing network namespace
findmnt: no match for net:[4026532992]

Please note that it is not possible to create namespaces via the nsfs filesystem, only via the syscalls clone() (CLONE_NEW...) and unshare(). The nsfs only reflects the kernel's current state with respect to namespaces; it cannot create or destroy them.

Namespaces are destroyed automatically as soon as no reference to them is left: no processes using them (so no /proc/$PID/ns/... entries) AND no bind mounts either, as we've explored in the example above.


That's the "Name Space File System", used by the setns system call and, as its source code shows, by namespace-related ioctls (e.g. NS_GET_USERNS, NS_GET_OWNER_UID, ...).

NSFS pseudo-file entries used to be provided by the /proc file system until Linux 3.19. Here is the commit that made this change.

See Stephen Kitt's comment for a possible explanation of this file's presence.