multiprocessing fork() vs spawn()

  1. Is it that fork is much quicker because it does not try to identify which resources to copy?

Yes, it's much quicker. The kernel clones the whole process and later copies only the memory pages that actually get modified, one page at a time. There is no need to pipe resources to a new process or to boot a fresh interpreter from scratch.
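
A rough way to measure that startup gap yourself; this is a sketch that assumes a POSIX system (the "fork" start method is unavailable on Windows), and the absolute numbers vary a lot by machine:

```python
import multiprocessing as mp
import time

def noop():
    pass

def time_start_method(method, runs=5):
    # Average wall-clock time to start and join one no-op child process.
    ctx = mp.get_context(method)
    start = time.perf_counter()
    for _ in range(runs):
        p = ctx.Process(target=noop)
        p.start()
        p.join()
    return (time.perf_counter() - start) / runs

if __name__ == "__main__":
    for method in ("fork", "spawn"):   # "fork" is POSIX-only
        print(f"{method}: {time_start_method(method):.3f} s per child")
```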

  2. Is it that, since fork duplicates everything, it would "waste" much more resources compared to spawn()?

On modern kernels, fork only does "copy-on-write", so it only affects memory pages which actually change. The caveat is that in CPython, "write" already encompasses merely iterating over an object: reading an object increments its reference count, and the reference count lives in the object's header, so the page holding the object gets dirtied and copied.

If you have long-running processes with lots of small objects in use, this can mean you waste more memory than with spawn. Anecdotally, I recall Facebook claiming to have reduced memory usage considerably by switching their Python processes from "fork" to "spawn".
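
Here is a minimal, Linux-only sketch of that effect: it reads /proc/self/smaps_rollup (available since kernel 4.14) to count memory that is private to the child, i.e. pages the kernel has already copied instead of sharing them copy-on-write with the parent. The sizes are illustrative only.

```python
import multiprocessing as mp
import re

# Lots of small objects created in the parent before forking.
DATA = [str(i) for i in range(2_000_000)]

def private_kb():
    # Memory that is private to this process, i.e. pages no longer shared
    # copy-on-write with the parent (Linux-only).
    with open("/proc/self/smaps_rollup") as f:
        text = f.read()
    return sum(int(kb) for kb in re.findall(r"^Private_\w+:\s+(\d+) kB", text, re.M))

def child():
    before = private_kb()
    checksum = sum(len(s) for s in DATA)   # a read-only pass over the inherited list
    after = private_kb()
    print(f"checksum={checksum}; private memory grew by ~{after - before} kB "
          "just from touching reference counts while reading")

if __name__ == "__main__":
    ctx = mp.get_context("fork")           # the child inherits DATA via copy-on-write
    p = ctx.Process(target=child)
    p.start()
    p.join()
```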


There's a tradeoff among the three multiprocessing start methods:

  1. fork is faster because it makes a copy-on-write clone of the parent process's entire virtual memory, including the initialized Python interpreter, loaded modules, and constructed objects in memory.

    But fork does not copy the parent process's threads. Any lock that another thread held in the parent at the moment of the fork is copied into the child in its locked state, with no owning thread left to release it, ready to cause a deadlock as soon as the child tries to acquire it (see the first sketch after this list). Likewise, any native library that was running its own threads will be in a broken state in the child.

    The copied Python modules and objects might be useful or they might needlessly bloat every forked child process.

    The child process also "inherits" OS resources like open file descriptors and open network ports. Those can also lead to problems, but Python works around some of them.

    So fork is fast, unsafe, and maybe bloated.

    However, these safety problems might not cause trouble, depending on what the child process does.

  2. spawn starts a Python child process from scratch without the parent process's memory, file descriptors, threads, etc. Technically, spawn forks a duplicate of the current process, then the child immediately calls exec to replace itself with a fresh Python interpreter, which then imports the target's module again and runs the target callable. That re-import is why code that starts processes must sit behind the if __name__ == "__main__": guard (second sketch below).

    So spawn is safe, compact, and slower, since Python has to start and initialize itself, read files, and load and initialize modules, etc.

    However, it might not be noticeably slower compared to the work that the child process does.

  3. forkserver forks a duplicate of the current Python process, which trims itself down to approximately a fresh Python process and becomes the "fork server". Then each time you start a child process, the parent asks the fork server to fork a child and run its target callable.

    Those child processes all start out compact and without stuck locks.

    forkserver is more complicated and not well documented. Bojan Nikolic's blog post explains more about forkserver and its secret set_forkserver_preload() method for preloading some modules (the third sketch below uses it). Be wary of using an undocumented method, especially before the bug fix in Python 3.7.0.

    So forkserver is fast, compact, and safe, but it's more complicated and not well documented.
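
First sketch (the lock hazard from point 1): a minimal, POSIX-only illustration with made-up function names and timeouts. Under "fork" the child inherits the lock already locked and the acquire times out; under "spawn" the child re-imports the module, gets a fresh unlocked lock, and acquires it immediately.

```python
import multiprocessing as mp
import threading
import time

lock = threading.Lock()

def hold_lock():
    # A background thread in the parent grabs the lock and keeps it.
    lock.acquire()
    time.sleep(30)

def child():
    # Under "fork" this times out: the lock was copied in its locked state,
    # but the thread that owns it was not copied, so nobody can release it.
    # Under "spawn" this module is re-imported and the lock is fresh.
    print("child acquired lock:", lock.acquire(timeout=3))

if __name__ == "__main__":
    threading.Thread(target=hold_lock, daemon=True).start()
    time.sleep(0.5)                    # let the thread take the lock first
    for method in ("fork", "spawn"):   # "fork" is POSIX-only
        p = mp.get_context(method).Process(target=child)
        p.start()
        p.join()
    # expected output: False for "fork", True for "spawn"
```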
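
Second sketch (spawn from point 2): because the spawned child re-imports the main module, anything that starts processes has to live behind the if __name__ == "__main__": guard, while top-level code runs again in every child. The function names are made up for the example.

```python
import multiprocessing as mp
import os

print(f"module imported in pid {os.getpid()}")   # runs again in every spawned child

def work(n):
    return n * n

if __name__ == "__main__":
    mp.set_start_method("spawn")       # the default on Windows, and on macOS since 3.8
    with mp.Pool(2) as pool:
        print(pool.map(work, range(5)))
```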
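
Third sketch (forkserver from point 3): set_forkserver_preload() asks the fork server to import the listed modules once, so every forked worker starts with them already loaded. Here json just stands in for whatever heavy modules your workers actually need; forkserver is POSIX-only.

```python
import multiprocessing as mp

def work(n):
    import json        # already imported in the fork server, so this is just a lookup
    return json.dumps({"n": n})

if __name__ == "__main__":
    mp.set_start_method("forkserver")        # POSIX-only
    mp.set_forkserver_preload(["json"])      # imported once, in the fork server process
    with mp.Pool(2) as pool:
        print(pool.map(work, range(3)))
```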

[The docs aren't great on all this so I've combined info from multiple sources and made some inferences. Do comment on any mistakes.]