Would an executable need an OS kernel to run?

As someone who has written programs that execute without an OS, I offer a definitive answer.

Would an executable need an OS kernel to run?

That depends on how that program was written and built.
You could write a program (assuming you have the knowledge) that does not require an OS at all.
Such a program is described as standalone.
Boot loaders and diagnostic programs are typical uses for standalone programs.

However, the typical program written and built in some host OS environment defaults to executing in that same host OS environment.
Very explicit decisions and actions are required to write and build a standalone program.


... the output from the compiler is the machine code (executable) which I thought were instructions to the CPU directly.

Correct.

Recently I was reading up on kernels and I found out that programs cannot access the hardware directly but have to go through the kernel.

That's a restriction imposed by a CPU mode that the OS uses to execute programs, and facilitated by certain build tools such as compilers and libraries.
It is not an intrinsic limitation on every program ever written.


So when we compile some simple source code, say with just a printf() function, and the compilation produces the executable machine code, will each instruction in this machine code be directly executed from memory (once the code is loaded into memory by the OS) or will each command in the machine code still need to go through the OS (kernel) to be executed?

Every instruction is executed by the CPU.
An instruction that is unsupported or illegal (e.g. process has insufficient privilege) will cause an immediate exception, and the CPU will instead execute a routine to handle this unusual condition.

A printf() call should not be used as an example of "simple source code".
The translation from a high-level programming language to machine code is not as trivial as you imply.
And printf() is one of the more complex functions in the C runtime library, performing both data conversion and I/O.

Note that your question stipulates an environment with an OS (and a runtime library).
Once the system is booted, and the OS is given control of the computer, restrictions are imposed on what a program can do (e.g. I/O must be performed by the OS).
If you expect to execute a standalone program (i.e. without an OS), then you must not boot the computer to run the OS.


... what happens after the machine code is loaded into memory?

That depends on the environment.

For a standalone program, it can be executed, i.e. control is handed over by jumping to the program's start address.

For a program loaded by the OS, the OS first has to create an execution space for the process that will execute the program, and the program has to be dynamically linked with the shared libraries it depends on.

Will it go through the kernel or directly talk to the processor?

Machine code is executed by the CPU.
It does not "go through the kernel", but neither does it "talk to the processor".
Each machine instruction (an op code plus its operands) is decoded by the CPU and the operation is performed.

Perhaps the next topic you should investigate is CPU modes.


The kernel is "just" more code. It's just that this code is a layer that lives between the lowest parts of your system and the actual hardware.

All of it runs directly on the CPU, you just transition up through layers of it to do anything.

Your program "needs" the kernel in just the same way it needs the standard C library in order to use the printf function in the first place.

The actual code of your program runs on the CPU, but the branches that code makes to print something on screen go through the code for the C printf function, then through various other systems, each of which does its own processing to work out how hello world! actually gets printed on your screen.

Say you have a terminal program running on a desktop window manager, running on your kernel which in turn is running on your hardware.

There's a lot more that goes on but let's keep it simple...

  1. In your terminal program you run your program to print hello world!
  2. The terminal sees that the program has written (via the C output routines) hello world! to the console
  3. The terminal program goes up to the desktop window manager saying "I got hello world! written at me, can you put it at position x, y please?"
  4. The desktop window manager goes up to the kernel with "one of my programs wants your graphics device to put some text at this position, get to it dude!"
  5. The kernel passes the request out to the graphics device driver, which formats it in a way that the graphics card can understand
  6. Depending on how the graphics card is connected, other kernel device drivers may need to be called to push the data out over physical device buses such as PCIe, handling things like selecting the correct device and passing the data through any relevant bridges or converters
  7. The hardware displays stuff.

This is a massive oversimplification for description only. Here be dragons.

Effectively everything you do that needs hardware access, be it display, blocks of memory, bits of files or anything like that has to go through some device driver in the kernel to work out exactly how to talk to the relevant device. Be it a filesystem driver on top of a SATA hard disk controller driver which itself is sitting on top of a PCIe bridge device.

The kernel knows how to tie all these devices together and presents a relatively simple interface for programs to do things without having to know about how to do all of these things themselves.

Desktop window managers provide a layer that means that programs don't have to know how to draw windows and play well with other programs trying to display things at the same time.

Finally the terminal program means that your program doesn't need to know how to draw a window, nor how to talk to the kernel graphics card driver, nor all of the complexity to do with dealing with screen buffers and display timing and actually wiggling the data lines to the display.

It's all handled by layers upon layers of code.


It depends on the environment. In many older (and simpler!) computers, such as the IBM 1401, the answer would be "no". Your compiler and linker emitted a standalone "binary" that ran without any operating system at all. When your program stopped running, you loaded a different one, which also ran with no OS.

An operating system is needed in modern environments because you aren't running just one program at a time. Sharing the CPU core(s), the RAM, the mass storage device, the keyboard, mouse, and display among multiple programs at once requires coordination. The OS provides that. So in a modern environment your program can't just read and write the disk or SSD; it has to ask the OS to do that on its behalf. The OS gets such requests from all the programs that want to access the storage device, enforces things like access controls (it can't allow ordinary users to write to the OS's files), queues the requests to the device, and routes the returned information back to the correct programs (processes).

In addition, modern computers (unlike, say, the 1401) support the connection of a very wide variety of I/O devices, not just the ones IBM would sell you in the old days. Your compiler and linker can't possibly know about all of the possibilities. For example, your keyboard might be interfaced via PS/2, or USB. The OS allows you to install device-specific "device drivers" that know how to talk to those devices, but present a common interface for the device class to the OS. So your program, and even the OS, doesn't have to do anything different for getting keystrokes from a USB vs a PS/2 keyboard, or for accessing, say, a local SATA disk vs a USB storage device vs storage that's somewhere off on a NAS or SAN. Those details are handled by device drivers for the various device controllers.

For mass storage devices, the OS provides atop all of those a file system driver that presents the same interface to directories and files regardless of where and how the storage is implemented. And again, the OS worries about access controls and serialization. In general, for example, the same file shouldn't be opened for writing by more than one program at a time without jumping through some hoops (but simultaneous reads are generally ok).

So in a modern general-purpose environment, yes - you really need an OS. But even today there are computers such as real-time controllers that aren't complicated enough to need one.

In the Arduino environment, for example, there isn't really an OS. Sure, there's a bunch of library code that the build environment incorporates into every "binary" it builds. But since there is no persistence of that code from one program to the next, it's not an OS.