How to mmap the stack for the clone() system call on Linux?

You'd want the MAP_ANONYMOUS flag for mmap, and MAP_GROWSDOWN as well, since you want to use the mapping as a stack.

Something like:

void *stack = mmap(NULL, initial_stacksize, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_GROWSDOWN | MAP_ANONYMOUS, -1, 0);

See the mmap man page for more info. And remember, clone is a low-level concept that you're not meant to use unless you really need what it offers. It offers a lot of control, like giving the child its own stack, in case you want to do some trickery (like having the stack accessible in all the related processes). Unless you have a very good reason to use clone, stick with fork or pthreads.
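To make the mmap call concrete, here is a minimal sketch of handing such a mapping to the glibc clone() wrapper. The names STACK_SIZE, child_fn and run_child_on_mmap_stack are made up for illustration, and the 64 KiB size is an arbitrary assumption. The detail that is easy to miss is that mmap returns the lowest address of the mapping, while clone() expects the highest one, because the stack grows downward on the usual Linux architectures.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)  /* illustrative size, an assumption */

/* Runs in the child, on the mmap'd stack. */
static int child_fn(void *arg)
{
    int *flag = arg;
    *flag = 42;   /* visible to the parent because of CLONE_VM */
    return 0;     /* becomes the child's exit status */
}

/* Hypothetical helper: returns the value stored by the child, or -1 on error. */
int run_child_on_mmap_stack(void)
{
    int flag = 0;

    /* Same flags as the mmap call above. */
    void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_GROWSDOWN | MAP_ANONYMOUS, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* mmap returned the lowest address; clone() wants the top of the stack. */
    void *stack_top = (char *)stack + STACK_SIZE;
    assert(((uintptr_t)stack_top & 15) == 0);  /* page-aligned, so 16-byte aligned */

    /* CLONE_VM shares the address space; SIGCHLD lets the parent use waitpid(). */
    pid_t pid = clone(child_fn, stack_top, CLONE_VM | SIGCHLD, &flag);
    if (pid == -1) {
        munmap(stack, STACK_SIZE);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;  /* child may still be running: do not unmap its stack */

    munmap(stack, STACK_SIZE);
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return flag;
}
```

Calling run_child_on_mmap_stack() from main should return the value the child wrote, which shows that the child really ran on the new stack while sharing the parent's memory.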


Stacks are not, and never can be, unlimited in their space for growth. Like everything else, they live in the process's virtual address space, and the amount by which they can grow is always limited by the distance to the adjacent mapped memory region.

When people talk about the stack growing dynamically, what they might mean is one of two things:

  • Pages of the stack might be copy-on-write zero pages, which do not get private copies made until the first write is performed.
  • Lower parts of the stack region may not yet be reserved (and thus not count towards the process's commit charge, i.e. the amount of physical memory/swap the kernel has accounted for as reserved for the process) until a guard page is hit, in which case the kernel commits more and moves the guard page, or kills the process if there is no memory left to commit.

Relying on the MAP_GROWSDOWN flag is unreliable and dangerous because it cannot protect you against mmap creating a new mapping directly adjacent to your stack, which will then get clobbered. (See http://lwn.net/Articles/294001/) For the main thread, the kernel automatically reserves the stack-size ulimit worth of address space (not memory) below the stack and prevents mmap from allocating it. (But beware! Some broken vendor-patched kernels disable this behavior, leading to random memory corruption!)

For other threads, you simply must mmap the entire range of address space the thread might need for its stack when creating it. There is no other way. You could make most of it initially non-writable/non-readable and change that on faults, but then you'd need signal handlers, and this solution is not acceptable in a POSIX threads implementation because it would interfere with the application's signal handlers. (Note that, as an extension, the kernel could offer special MAP_ flags to deliver a different signal instead of SIGSEGV on illegal access to the mapping, and then the threads implementation could catch and act on this signal. But Linux at present has no such feature.)
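As a rough sketch of "mmap the entire range up front", the code below reserves the whole stack in one mapping, without MAP_GROWSDOWN. The struct thread_stack type and the thread_stack_alloc / thread_stack_free helpers are hypothetical names invented for this example. The PROT_NONE guard page at the low end is an addition of this sketch, not something the text above requires: it turns an overflow into a fault instead of silent corruption of a neighbouring mapping. Untouched pages of the mapping should not consume physical memory until they are first written.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one thread stack. */
struct thread_stack {
    void  *base;   /* lowest address of the mapping (start of the guard page) */
    size_t total;  /* guard page plus usable stack, in bytes */
    void  *top;    /* highest address; the value to pass to clone() */
};

/* Reserve the full stack range at once. Returns 0 on success, -1 on error. */
int thread_stack_alloc(struct thread_stack *s, size_t usable_size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* Round the usable size up to a whole number of pages. */
    usable_size = (usable_size + page - 1) / page * page;
    size_t total = usable_size + page;  /* one extra page for the guard */

    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Make the lowest page inaccessible so an overflow faults. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, total);
        return -1;
    }

    s->base  = base;
    s->total = total;
    s->top   = base + total;
    return 0;
}

/* Release the mapping once no thread is running on it any more. */
void thread_stack_free(struct thread_stack *s)
{
    munmap(s->base, s->total);
}
```

The top field is what would be passed as the stack argument to clone(). The size has to be chosen large enough for the thread's worst case, since a mapping like this never grows.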

Finally, note that the clone syscall does take a stack pointer argument, and it needs one: it is unsafe to wait until after returning to userspace to change the stack pointer in the "child". Unless all signals are blocked, a signal handler could run immediately on the wrong stack, and on some architectures the stack pointer must be valid and point to an area that is safe to write at all times.

The syscall must still be performed from assembly code, because the userspace wrapper has to get the "child" thread running on the desired stack and avoid writing anything to the parent's stack.

Not only is modifying the stack pointer impossible from C, but you also couldn't avoid the possibility that the compiler would clobber the parent's stack after the syscall but before the stack pointer was changed.