System call overhead

For example, consider getpid(). When a system call is made to getpid(), my guess is that if control is currently in the child process, then a context switch has to be made to enter the parent process to get the pid.

No context switch to the child process should be necessary here — the kernel should have all of the necessary data available to itself. In most cases, the kernel will only switch contexts to a userspace process in the scheduler, or when returning from a system call.

Also, when getpid() is called, some metadata is transferred across the user-space boundary as the call enters and exits the kernel. So will the constant switching between user space and kernel space also cause some overhead?

Yes, if getpid() was being called often, the overhead would certainly build up. There are some approaches available which can avoid this overhead for simple "getter" system calls like getpid() and gettimeofday(); one such approach which was at one point used under Linux was to store the (known) result of the system call in a special memory page. (This mechanism was known as vsyscall.)
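To make that point concrete, here is a minimal sketch (assuming Linux with glibc, where on current kernels the vDSO has taken over the role the vsyscall page used to play) comparing gettimeofday() through the fast path with a forced kernel entry via syscall(2):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/time.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    struct timeval tv;

    /* Normally served from the vDSO: no trap into the kernel. */
    gettimeofday(&tv, NULL);
    printf("fast path:   %ld.%06ld\n", (long)tv.tv_sec, (long)tv.tv_usec);

    /* syscall(2) bypasses the fast path and always enters the kernel. */
    syscall(SYS_gettimeofday, &tv, NULL);
    printf("kernel path: %ld.%06ld\n", (long)tv.tv_sec, (long)tv.tv_usec);

    return 0;
}

Timing the two loops separately would show the difference in cost, which is essentially the user/kernel transition itself.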


Pardon for generalizing (rather than qualifying every sentence).

A call to a system service (such as one that returns process information) has a user mode shell (a thin wrapper). This shell triggers an exception that is routed through the system dispatch table, which invokes the kernel mode system service.

The switch to kernel mode requires something similar to a process context switch. For example, it requires changing from the user stack to the kernel stack (and other system-dependent changes).

The calling process supplies a user mode return buffer. For security, the system service checks that it is a valid user mode buffer before writing the response data into it.
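As a rough Linux-specific illustration of that validity check (the description above is deliberately OS-agnostic, and this sketch is my own, not from the answer): passing an obviously bad user buffer makes the service fail with EFAULT rather than corrupting anything.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[64];
    int fd = open("/proc/self/status", O_RDONLY);
    if (fd < 0)
        return 1;

    ssize_t ok  = read(fd, buf, sizeof buf);        /* valid user buffer  */
    ssize_t bad = read(fd, (void *)1, sizeof buf);  /* bogus user pointer */

    /* The bad call is rejected by the kernel's buffer check: it returns -1
     * and sets errno to EFAULT instead of faulting inside the kernel. */
    printf("ok=%zd bad=%zd errno=%d (EFAULT=%d)\n", ok, bad, errno, EFAULT);

    close(fd);
    return 0;
}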

A library function like getpid that only returns information about the current process may not require a switch to kernel mode.


I've done some more precise benchmarking on an x86-64 Linux system (compiled with -O3):

ns    relative(rounded) function
4.89  1      regular_function  //just a value return
6.05  1      getpid   //glibc caches this one (forks invalidate the cached value)
17.7  4      sysconf(_SC_PAGESIZE)
22.6  5      getauxval(AT_EUID)
25.4  5      sysconf(_SC_NPROCESSORS_ONLN)
27.1  6      getauxval(AT_UID)
54.1  11     gettimeofday
235   48     geteuid
261   53     getuid
264   54     getppid
314   64     sysconf(_SC_OPEN_MAX)
622   127    pread@0 // IO funcs benchmarked with 1-byte quantities
638   130    read    // through a 1 Gigabyte file
1690  346    write
1710  350    pwrite@0

The cheapest "syscalls" are the ones that go through the auxiliary vector (~20–30ns). The calls in the middle (~250–310ns) should reflect the average overhead most accurately, as there shouldn't be much work to be done in the kernel for them.
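For what it's worth, getauxval() can answer from data the kernel already placed in the process image at exec time, which is presumably why it stays well under the cost of a real kernel entry. A tiny sketch (mine, not part of the benchmark above):

#include <stdio.h>
#include <sys/auxv.h>

int main(void)
{
    /* These values come from the ELF auxiliary vector set up at exec,
     * so no kernel entry should be needed at call time. */
    printf("AT_PAGESZ = %lu\n", getauxval(AT_PAGESZ));
    printf("AT_UID    = %lu\n", getauxval(AT_UID));
    return 0;
}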

For comparison, malloc+free pairs with small size requests (<64 bytes => no system calls) cost about 70-80ns (see my answer at Cost of static memory allocation vs dynamic memory allocation in C).
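The exact harness isn't shown here; a minimal sketch of how per-call averages like these can be approximated (my own loop using CLOCK_MONOTONIC, not necessarily the setup used for the table above) would look something like this:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Average the cost of one call over many iterations; the result still
 * includes the (small) loop and indirect-call overhead. */
static double bench(long (*fn)(void), long iters)
{
    struct timespec a, b;
    volatile long sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &a);
    for (long i = 0; i < iters; i++)
        sink += fn();
    clock_gettime(CLOCK_MONOTONIC, &b);
    (void)sink;

    return ((b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec)) / iters;
}

static long call_getpid(void)  { return getpid(); }
static long call_getuid(void)  { return getuid(); }
static long call_getppid(void) { return getppid(); }

int main(void)
{
    const long N = 1000000;
    printf("getpid   %6.1f ns/call\n", bench(call_getpid, N));
    printf("getuid   %6.1f ns/call\n", bench(call_getuid, N));
    printf("getppid  %6.1f ns/call\n", bench(call_getppid, N));
    return 0;
}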

https://softwareengineering.stackexchange.com/questions/311165/why-isnt-there-generic-batching-syscall-in-linux-bsd/350173 has some interesting ideas about how the syscall overhead could be minimized.