How does a Segmentation Fault work under-the-hood?

All modern CPUs have the capacity to interrupt the currently-executing machine instruction. They save enough state (usually, but not always, on the stack) to make it possible to resume execution later, as if nothing had happened (the interrupted instruction will be restarted from scratch, usually). Then they start executing an interrupt handler, which is just more machine code, but placed at a special location so the CPU knows where it is in advance. Interrupt handlers are always part of the kernel of the operating system: the component that runs with the greatest privilege and is responsible for supervising execution of all the other components.1,2

Interrupts can be synchronous, meaning that they are triggered by the CPU itself as a direct response to something the currently-executing instruction did, or asynchronous, meaning that they happen at an unpredictable time because of an external event, like data arriving on the network port. Some people reserve the term "interrupt" for asynchronous interrupts, and call synchronous interrupts "traps", "faults", or "exceptions" instead, but those words all have other meanings so I'm going to stick with "synchronous interrupt".

Now, most modern operating systems have a notion of processes. At its most basic, this is a mechanism whereby the computer can run more than one program at the same time, but it is also a key aspect of how operating systems configure memory protection, which is a feature of most (but, alas, still not all) modern CPUs. It goes along with virtual memory, which is the ability to alter the mapping between memory addresses and actual locations in RAM. Memory protection allows the operating system to give each process its own private chunk of RAM that only it can access. It also allows the operating system (acting on behalf of some process) to designate regions of RAM as read-only, executable, shared among a group of cooperating processes, etc. There will also be a chunk of memory that is only accessible by the kernel.3
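
To make this concrete, here is a minimal sketch of a process asking the kernel to change the protection of one of its own pages, using POSIX mmap and mprotect. (MAP_ANONYMOUS is a widespread extension, spelled MAP_ANON on some older systems; details vary by platform.)

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int
main(void)
{
        long pagesize = sysconf(_SC_PAGESIZE);
        char *p;

        /* Ask the kernel for one page of private, readable+writable RAM. */
        p = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        strcpy(p, "hello");                  /* fine: the page is writable */

        /* Now ask the kernel to revoke write permission on that page. */
        if (mprotect(p, pagesize, PROT_READ) != 0) {
                perror("mprotect");
                return 1;
        }

        printf("still readable: %s\n", p);   /* fine: reads are still allowed */
        /* p[0] = 'H'; */                    /* would now trigger a memory-protection interrupt */

        munmap(p, pagesize);
        return 0;
}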

As long as each process accesses memory only in the ways that the CPU is configured to allow, memory protection is invisible. When a process breaks the rules, the CPU will generate a synchronous interrupt, asking the kernel to sort things out. It regularly happens that the process didn't really break the rules; the kernel just needs to do some work before the process can be allowed to continue. For instance, if a page of a process's memory needs to be "evicted" to the swap file in order to free up space in RAM for something else, the kernel will mark that page inaccessible. The next time the process tries to use it, the CPU will generate a memory-protection interrupt; the kernel will retrieve the page from swap, put it back where it was, mark it accessible again, and resume execution.

But suppose that the process really did break the rules. It tried to access a page that has never had any RAM mapped to it, or it tried to execute a page that is marked as not containing machine code, or whatever. The family of operating systems generally known as "Unix" all use signals to deal with this situation.4 Signals are similar to interrupts, but they are generated by the kernel and fielded by processes, rather than being generated by the hardware and fielded by the kernel. Processes can define signal handlers in their own code and tell the kernel where they are. When necessary, those signal handlers execute, interrupting the normal flow of control. Signals all have a number and two names, one of which is a cryptic acronym and the other a slightly less cryptic phrase. The signal that's generated when a process breaks the memory-protection rules is (by convention) number 11, and its names are SIGSEGV and "Segmentation fault".5,6
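
To illustrate, here is a minimal sketch of "defining a signal handler and telling the kernel where it is", using the POSIX sigaction interface. (The null-pointer write is just a convenient way to break the rules; note that only async-signal-safe functions like write and _exit may be called inside the handler.)

#include <signal.h>
#include <unistd.h>

static void
handle_segv(int signo)
{
        static const char msg[] = "caught SIGSEGV, exiting\n";

        (void)signo;
        write(STDERR_FILENO, msg, sizeof msg - 1);
        _exit(1);   /* returning would restart the faulting instruction */
}

int
main(void)
{
        struct sigaction sa;

        sa.sa_handler = handle_segv;   /* tell the kernel where our handler is */
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGSEGV, &sa, NULL);

        *(volatile int *)0 = 42;       /* deliberately break the memory-protection rules */
        return 0;
}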

An important difference between signals and interrupts is that there is a default behavior for every signal. If the operating system fails to define handlers for all interrupts, that is a bug in the OS, and the entire computer will crash when the CPU tries to invoke a missing handler. But processes are under no obligation to define signal handlers for all signals. If the kernel generates a signal for a process, and that signal has been left at its default behavior, the kernel will just go ahead and do whatever the default is and not bother the process. Most signals' default behaviors are either "do nothing" or "terminate this process and maybe also produce a core dump." SIGSEGV is one of the latter.

So, to recap, we have a process that broke the memory-protection rules. The CPU suspended the process and generated a synchronous interrupt. The kernel fielded that interrupt and generated a SIGSEGV signal for the process. Let's assume the process did not set up a signal handler for SIGSEGV, so the kernel carries out the default behavior, which is to terminate the process. This has all the same effects as the _exit system call: open files are closed, memory is deallocated, etc.

Up till this point nothing has printed out any messages that a human can see, and the shell (or, more generally, the parent process of the process that just got terminated) has not been involved at all. SIGSEGV goes to the process that broke the rules, not its parent. The next step in the sequence, though, is to notify the parent process that its child has been terminated. This can happen in several different ways, of which the simplest is when the parent is already waiting for this notification, using one of the wait system calls (wait, waitpid, wait4, etc). In that case, the kernel will just cause that system call to return, and supply the parent process with a code number called an exit status.7 The exit status informs the parent why the child process was terminated; in this case, it will learn that the child was terminated due to the default behavior of a SIGSEGV signal.
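
Here is a hedged sketch of that notification path, assuming POSIX fork and waitpid: the child breaks the rules, the parent waits, and the standard W* macros decode the exit status the kernel supplies.

#include <stdio.h>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
        int status;
        pid_t pid = fork();

        if (pid == 0) {
                *(volatile int *)0 = 1;   /* child: deliberate memory-protection violation */
                _exit(0);                 /* never reached */
        }

        waitpid(pid, &status, 0);         /* kernel fills in the child's exit status */

        if (WIFSIGNALED(status))
                printf("child terminated by signal %d (SIGSEGV is %d)\n",
                       WTERMSIG(status), SIGSEGV);
        else if (WIFEXITED(status))
                printf("child exited normally with code %d\n", WEXITSTATUS(status));
        return 0;
}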

The parent process may then report the event to a human by printing a message; shell programs almost always do this. Your crsh doesn't include code to do that, but it happens anyway, because the C library routine system runs a full-featured shell, /bin/sh, "under the hood". crsh is the grandparent in this scenario; the parent-process notification is fielded by /bin/sh, which prints its usual message. Then /bin/sh itself exits, since it has nothing more to do, and the C library's implementation of system receives that exit notification. You can see that exit notification in your code by inspecting the return value of system; but it won't tell you that the grandchild process died on a segfault, because that information was consumed by the intermediate shell process.
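
You can check this with a sketch along these lines (./segv stands for any program that segfaults, such as the one in another answer below; encoding a signal death as exit code 128+11 is a common shell convention, not a guarantee):

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

int
main(void)
{
        int status = system("./segv");    /* /bin/sh runs ./segv on our behalf */

        /* The intermediate shell exited normally; the grandchild's death by
           SIGSEGV survives only as the shell's own exit code (often 139). */
        if (WIFEXITED(status))
                printf("shell exited with code %d\n", WEXITSTATUS(status));
        else if (WIFSIGNALED(status))
                printf("the shell itself died on signal %d\n", WTERMSIG(status));
        return 0;
}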


Footnotes

  1. Some operating systems don't implement device drivers as part of the kernel; however, all interrupt handlers still have to be part of the kernel, and so does the code that configures memory protection, because the hardware doesn't allow anything but the kernel to do these things.

  2. There may be a program called a "hypervisor" or "virtual machine manager" that is even more privileged than the kernel, but for purposes of this answer it can be considered part of the hardware.

  3. The kernel is a program, but it is not a process; it is more like a library. All processes execute parts of the kernel's code, from time to time, in addition to their own code. There may be a number of "kernel threads" that only execute kernel code, but they do not concern us here.

  4. The only OS you are likely to encounter nowadays that can't be considered an implementation of Unix is, of course, Windows. It does not use signals in this situation. (Indeed, it does not have signals; on Windows the <signal.h> interface is completely faked by the C library.) It uses something called "structured exception handling" instead.

  5. Some memory-protection violations generate SIGBUS ("Bus error") instead of SIGSEGV. The line between the two is underspecified and varies from system to system. If you've written a program that defines a handler for SIGSEGV, it is probably a good idea to define the same handler for SIGBUS.

  6. "Segmentation fault" was the name of the interrupt generated for memory-protection violations by one of the computers that ran the original Unix, probably the PDP-11. "Segmentation" is a type of memory protection, but nowadays the term "segmentation fault" refers generically to any sort of memory protection violation.

  7. All the other ways the parent process might be notified of a child having terminated end up with the parent calling wait and receiving an exit status. It's just that something else happens first.


The shell does indeed have something to do with that message, and crsh indirectly calls a shell, which is probably bash.

I wrote a small C program that always seg faults:

#include <stdio.h>

int
main(int ac, char **av)
{
        int *i = NULL;

        *i = 12;        /* write through a NULL pointer: a memory-protection violation */

        return 0;
}

When I run it from my default shell, zsh, I get this:

4 % ./segv
zsh: 13512 segmentation fault  ./segv

When I run it from bash, I get what you noted in your question:

bediger@flq123:csrc % ./segv
Segmentation fault

I was going to write a signal handler in my code, but then I realized that the system() library call used by crsh execs a shell (/bin/sh, according to man 3 system). That /bin/sh is almost certainly the one printing "Segmentation fault", since crsh certainly isn't.

If you rewrite crsh to use the execve() system call to run the program, you will not see the "Segmentation fault" string. It comes from the shell invoked by system().
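
Here is a rough sketch of that rewrite, assuming POSIX fork, execve, and waitpid, with ./segv being the test program above. Nothing prints "Segmentation fault" unless we print it ourselves:

#include <stdio.h>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
        char *argv[] = { "./segv", NULL };
        char *envp[] = { NULL };
        int status;
        pid_t pid = fork();

        if (pid == 0) {
                execve("./segv", argv, envp);   /* no shell involved */
                _exit(127);                     /* only reached if exec failed */
        }

        waitpid(pid, &status, 0);

        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
                printf("child segfaulted (this message is ours, not a shell's)\n");
        return 0;
}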


I can't seem to find any information on this aside from "the CPU's MMU sends a signal" and "the kernel directs it to the offending program, terminating it".

This is a bit of a garbled summary. The Unix signal mechanism is entirely different from the CPU-specific events that set it in motion.

In general, when a bad address is accessed (or a read-only area is written to, a non-executable section is executed, etc.), the CPU generates some CPU-specific event. On traditional non-VM architectures this was called a segmentation violation, because each "segment" (traditionally the read-only executable "text", the writable, variable-length "data", and the stack at the opposite end of memory) had a fixed range of addresses. On a modern architecture it is more likely to be a page fault (for unmapped memory) or an access violation (for read, write, and execute permission issues); I'll focus on those for the rest of this answer.

Now, at this point, the kernel can do several things. Page faults are also generated for memory that is valid but not loaded (e.g. swapped out, or in an mmapped file, etc.); in that case, the kernel will map in the memory and then restart the user program from the instruction that caused the error. Otherwise, it sends a signal. This doesn't exactly "direct [the original event] to the offending program": installing a signal handler is a different, mostly architecture-independent process, whereas forwarding the original event would amount to the program installing something like its own interrupt handler.

If the user program has a signal handler installed, this means creating a stack frame and setting the user program's execution position to the signal handler. The same is done for all signals, but in the case of a segmentation violation things are generally arranged so that, if the signal handler returns, it will restart the instruction that caused the error. The user program may have fixed the error, e.g. by mapping memory to the offending address (it's architecture-dependent whether this is possible). The signal handler can also jump to a different location in the program (typically via longjmp or by throwing an exception), to abort whatever operation caused the bad memory access.
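
Here is a hedged sketch of that "jump to a different location" escape hatch, using POSIX sigsetjmp/siglongjmp. (Recovering from SIGSEGV this way is only well-defined within narrow limits; the volatile null-pointer write is just a convenient way to fault.)

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf recover;

static void
handle_segv(int signo)
{
        (void)signo;
        /* Jump out of the handler instead of returning, so the
           faulting instruction is NOT restarted. */
        siglongjmp(recover, 1);
}

int
main(void)
{
        struct sigaction sa;

        sa.sa_handler = handle_segv;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGSEGV, &sa, NULL);

        if (sigsetjmp(recover, 1) == 0) {   /* the 1 saves the signal mask */
                *(volatile int *)0 = 1;     /* bad memory access */
                puts("not reached");
        } else {
                puts("recovered from the bad access via siglongjmp");
        }
        return 0;
}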

If the user program does not have a signal handler installed, it is simply terminated. On some architectures, if the signal is ignored it may restart the instruction over and over, causing an infinite loop.