Which file in the kernel specifies that fork(), vfork()... use the sys_clone() system call?

The fork() and vfork() wrappers in glibc are implemented via the clone() system call. To better understand the relationship between fork() and clone(), we must consider the relationship between processes and threads in Linux.

Traditionally, fork() would duplicate all the resources owned by the parent process and assign the copy to the child process. This approach incurs considerable overhead, all of which might be for nothing if the child immediately calls exec(). In Linux, fork() uses copy-on-write pages to delay or altogether avoid copying the data that can be shared between the parent and child processes. Thus, the only overhead incurred during a normal fork() is the copying of the parent's page tables and the allocation of a unique process descriptor, struct task_struct, for the child.

Linux also takes an unusual approach to threads. In Linux, threads are merely ordinary processes which happen to share some resources with other processes. This is a radically different approach to threads compared to other operating systems such as Windows or Solaris, where processes and threads are entirely different kinds of beasts. In Linux, each thread has an ordinary task_struct of its own that just happens to be set up in such a way that it shares certain resources, such as an address space, with the parent process.

The flags parameter of the clone() system call is a set of flags which indicate which resources, if any, the parent and child processes should share. Processes and threads are both created via clone(); the only difference is the set of flags that is passed to it.

A normal fork() could be implemented as:

clone(SIGCHLD, 0);

This creates a task which does not share any resources with its parent, and is set to send the SIGCHLD termination signal to the parent when it exits.
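As a rough userspace illustration of the call above, here is a minimal sketch that issues the raw clone system call with only SIGCHLD in the flags. The helper names fork_via_clone() and demo_fork_via_clone() are made up for this example, and the argument order (flags first, child stack pointer second) is an assumption that holds on x86-64 and most other architectures, but not all of them.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a fork-like call made directly through the raw clone
 * system call. Assumes flags is the first argument and the child stack
 * pointer the second (true on x86-64). A zero stack pointer means the child
 * keeps running on its own copy-on-write copy of the parent's stack. */
static pid_t fork_via_clone(void)
{
    return (pid_t)syscall(SYS_clone, (unsigned long)SIGCHLD, 0UL, 0UL, 0UL, 0UL);
}

/* Returns the child's exit status if the child's write stayed private to the
 * child, or -1 on any failure. */
int demo_fork_via_clone(void)
{
    volatile int value = 1;
    int status;

    pid_t pid = fork_via_clone();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        value = 2;      /* modifies only the child's copy of the page */
        _exit(7);       /* leave without running the parent's exit handlers */
    }

    /* SIGCHLD was the termination signal, so plain waitpid() can reap it. */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    if (value != 1)     /* the parent must not see the child's write */
        return -1;

    return WEXITSTATUS(status);
}
```

Calling demo_fork_via_clone() from main() should give back the child's exit code, which shows that nothing was shared: the child's write to value never reaches the parent.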

In contrast, a task which shares the address space, filesystem resources, file descriptors and signal handlers with the parent, in other words a thread, could be created with:

clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0);
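For comparison, a minimal sketch of the thread-like case using the glibc clone() wrapper. The names child_fn() and demo_shared_address_space() and the 64 KiB stack size are choices made for this example. SIGCHLD is added to the flags here (it is not in the call above) purely so that the parent can reap the child with a plain waitpid().

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

/* Runs in the new task. Because of CLONE_VM the pointer refers to the very
 * same memory the parent sees. */
static int child_fn(void *arg)
{
    *(volatile int *)arg = 42;
    return 0;
}

/* Returns the value the child stored into the parent's variable, or -1 on
 * failure. */
int demo_shared_address_space(void)
{
    volatile int shared = 0;
    int status;

    /* The child shares our address space, so it needs a stack of its own. */
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD;

    /* Stacks grow downwards on most architectures, so pass the top address. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE, flags, (void *)&shared);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited != pid)
        return -1;

    return shared;      /* the child's write is visible here */
}
```

Here the parent does observe the child's write, which is the opposite of the fork()-style case. A real threading library passes more flags than this (CLONE_THREAD among others), so treat it as an illustration of the sharing flags only.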

vfork() in turn is implemented via a separate CLONE_VFORK flag, which causes the parent process to sleep until the child releases the shared memory by calling exec() or exiting. Until then, the child is the sole thread of execution in the parent's address space. The child should not modify that memory, since any change is visible to the parent once it resumes. The corresponding clone() call could be as follows:

clone(CLONE_VFORK | CLONE_VM | SIGCHLD, 0);
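The blocking behaviour of CLONE_VFORK can be observed from userspace with a sketch like the following, again using the glibc clone() wrapper. The names vfork_child() and demo_vfork_like_clone(), the stack size, and the 50 ms delay are all made up for the example. Unlike a real vfork(), the child here runs on its own stack, which is why it can safely set a flag in the shared memory.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

#define VFORK_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

/* Runs in the child: wait a little, then set a flag in the shared memory. */
static int vfork_child(void *arg)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
    nanosleep(&delay, NULL);
    *(volatile int *)arg = 1;
    return 0;
}

/* Returns the flag value the parent saw immediately after clone() returned
 * (before reaping the child), or -1 on failure. */
int demo_vfork_like_clone(void)
{
    volatile int child_done = 0;
    int status;

    char *stack = malloc(VFORK_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* CLONE_VFORK suspends the caller until the child exits or execs. */
    pid_t pid = clone(vfork_child, stack + VFORK_STACK_SIZE,
                      CLONE_VFORK | CLONE_VM | SIGCHLD, (void *)&child_done);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    /* Sampled before waitpid(): already set, because we were asleep. */
    int seen = child_done;

    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited != pid)
        return -1;

    return seen;
}
```

If CLONE_VFORK is dropped from the flags, the parent would normally get past clone() while the child is still sleeping and would most likely read 0 instead.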

The implementation of sys_clone() is architecture specific, but the bulk of the work happens in do_fork(), defined in kernel/fork.c. This function calls the static copy_process(), which creates a new process as a copy of the parent but does not start it yet. copy_process() copies the registers, assigns a PID to the new task, and either duplicates or shares the appropriate parts of the process environment as specified by the clone flags. When copy_process() returns, do_fork() wakes the newly created process and schedules it to run. (In recent kernels do_fork() has been renamed; since 5.10 it is called kernel_clone().)


On Linux, the component responsible for translating userland system call functions into kernel system calls is libc. In glibc, the NPTL library redirects these calls to the clone(2) system call.