per process private file system mount points

Running unshare -m gives the calling process a private copy of its mount namespace, and also unshares file system attributes so that it no longer shares its root directory, current directory, or umask attributes with any other process.

So what does the above paragraph say? Let us try to understand it with a simple example.

Terminal 1:

I run the following commands in the first terminal.

# Start a new shell in its own mount namespace (requires root)
unshare -m /bin/bash
# Create a temporary directory to use as the mount point
secret_dir=$(mktemp -d --tmpdir=/tmp)
# Mount a 1 MiB tmpfs on that directory (-n: do not write to /etc/mtab)
mount -n -o size=1m -t tmpfs tmpfs "$secret_dir"
# List the mounts under /tmp as seen from this namespace
grep /tmp /proc/mounts

The last command prints:

tmpfs /tmp/tmp.7KtrAsd9lx tmpfs rw,relatime,size=1024k 0 0

Next, I ran the following commands as well.

cd /tmp/tmp.7KtrAsd9lx
touch hello
touch helloagain
ls -lFa

The output of the ls command is,

ls -lFa
total 4
drwxrwxrwt   2 root root   80 Sep  3 22:23 ./
drwxrwxrwt. 16 root root 4096 Sep  3 22:22 ../
-rw-r--r--   1 root root    0 Sep  3 22:23 hello
-rw-r--r--   1 root root    0 Sep  3 22:23 helloagain

So what is the big deal in doing all this? Why should I do it?

I now open another terminal (terminal 2) and run the following commands.

cd /tmp/tmp.7KtrAsd9lx
ls -lFa

The output is as below.

ls -lFa
total 8
drwx------   2 root root 4096 Sep  3 22:22 ./
drwxrwxrwt. 16 root root 4096 Sep  3 22:22 ../

The files hello and helloagain are not visible, even though I logged in as root to check. So the advantage is that this feature lets us create a private temporary filesystem that other processes, even root-owned ones, cannot see or browse unless they join the same mount namespace.
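One way to confirm that the two terminals really are in different mount namespaces is to compare their namespace links under /proc. The sketch below assumes a Linux /proc filesystem; the helper name same_mnt_ns is hypothetical, and reading the ns entries of another user's process may require root.

```shell
# Hypothetical helper: succeed if two PIDs share a mount namespace.
# Each /proc/PID/ns/mnt is a symlink such as "mnt:[4026531840]";
# identical link targets mean the same namespace.
same_mnt_ns() {
    ns_a=$(readlink "/proc/$1/ns/mnt") || return 2
    ns_b=$(readlink "/proc/$2/ns/mnt") || return 2
    [ "$ns_a" = "$ns_b" ]
}
```

To use it, run echo $$ in each terminal to get the two shell PIDs, then call same_mnt_ns with both PIDs as root. For the two terminals in the example above, it should return a non-zero status, since the first shell was started with unshare -m.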

From the man page of unshare,

mount namespace: Mounting and unmounting filesystems will not affect the rest of the system (CLONE_NEWNS flag), except for filesystems which are explicitly marked as shared (with mount --make-shared; see /proc/self/mountinfo for the shared flags).

It's recommended to use mount --make-rprivate or mount --make-rslave after unshare --mount to make sure that mountpoints in the new namespace are really unshared from the parental namespace.
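To act on that advice, it helps to see how a mount is currently propagated. The sketch below reads the optional fields of /proc/self/mountinfo, where shared:N marks a shared mount and master:N marks a slave mount; a mount with no such field is private. The helper name propagation_flags is hypothetical, and mount points containing spaces are not handled.

```shell
# Hypothetical helper: print the propagation fields of a mount point,
# or "private" when the mount has none.
propagation_flags() {
    awk -v target="$1" '
        $5 == target {
            # Optional fields start at field 7 and end at the "-" separator.
            flags = ""
            for (i = 7; i <= NF && $i != "-"; i++)
                flags = flags (flags == "" ? "" : " ") $i
            # Later entries for the same path are stacked on top, so keep the last.
            result = (flags == "" ? "private" : flags)
            found = 1
        }
        END { if (found) print result; else exit 1 }
    ' /proc/self/mountinfo
}
```

Inside the shell started by unshare -m, running mount --make-rprivate / and then propagation_flags / should typically print private. Recent util-linux releases are documented as applying private propagation by default after unshare --mount, so the explicit call may be redundant there, but it is harmless.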

The memory used for the namespace belongs to the VFS, which is part of the kernel. And, if we set it up right in the first place, we can create entire virtual environments in which we are the root user without having root permissions.
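As a rough illustration of being root without root permissions, unshare can create a user namespace together with the mount namespace. This is only a sketch: it assumes the kernel allows unprivileged user namespaces (some distributions disable them), and the wrapper name rootless_run is hypothetical.

```shell
# Hypothetical wrapper: run a command as "root" inside new user and mount
# namespaces. --map-root-user maps the calling user to UID 0 in the namespace.
rootless_run() {
    unshare --user --map-root-user --mount -- "$@"
}
```

For example, as an ordinary user, rootless_run sh -c 'id -u; mount -t tmpfs tmpfs /tmp && touch /tmp/hello' should print 0 and create the file on a tmpfs that only exists inside that namespace.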

References:

The example is based on the details in this blog post. The quotes in this answer come from this wonderful explanation by Mike. Another wonderful read on the topic is the answer here.


If you have bubblewrap installed on your system, you can do it easily in one step:

bwrap --dev-bind / / --tmpfs /tmp bash

In the example above, the inner bash will have its own view of /tmp.
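A quick way to check this, assuming bwrap is installed and allowed to create namespaces on your system, is to create a file in the sandbox's /tmp and confirm that it never appears on the host. The function name bwrap_tmp_check is hypothetical.

```shell
# Hypothetical check: touch a marker file inside the sandboxed /tmp,
# then verify that the host's /tmp does not contain it.
bwrap_tmp_check() {
    marker="bwrap-demo-$$"
    bwrap --dev-bind / / --tmpfs /tmp touch "/tmp/$marker" || return 2
    [ ! -e "/tmp/$marker" ]
}
```

Running bwrap_tmp_check && echo isolated should print isolated when the sandbox has its own tmpfs on /tmp.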

Solution inspired by @Ramesh's answer - thanks for it!