How does a Linux GUI work at the lowest level?

How it works (GNU/Linux + X11)

Overview

It looks something like this (not drawn to scale):

┌───────────────────────────────────────────────┐
│                       User                    │
│     ┌─────────────────────────────────────────┤
│     │             Application                 │
│     │            ┌──────────┬─────┬─────┬─────┤
│     │            │      ... │ SDL │ GTK │ Qt  │
│     │            ├──────────┴─────┴─────┴─────┤
│     │            │            Xlib            │
│     │            ├────────────────────────────┤
├─────┴───┬────────┴──┐         X11             │
│   GNU   │ Libraries │        Server           │
│   Tools │           │                         │
├─────────┘           │                         │
├─────────────────────┤                         │
│   Linux (kernel)    │                         │
├─────────────────────┴─────────────────────────┤
│                    Hardware                   │
└───────────────────────────────────────────────┘

As the diagram shows, X11 talks mostly with the hardware. However, it has to go via the kernel to get access to that hardware in the first place.

I am a bit hazy on the details (and I think they have changed since I last looked into it). There is a device, /dev/mem, that gives access to the whole of memory (physical memory, I think). As most graphics hardware is memory mapped, this file (see "everything is a file") can be used to access it. X11 would open the file (the kernel uses file permissions to decide whether it may do this), then use mmap to map the file into its virtual memory, so that the graphics hardware looks like ordinary memory. After the mmap, the kernel is no longer involved in each access.

X11 needs to know about the various graphics hardware, as it accesses it directly, via memory.

(This may have changed; in particular, the security model may no longer give access to ALL of the memory.)
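To make the open-then-mmap step concrete, here is a minimal sketch in C. It is an illustration only: map_file and unmap_file are hypothetical helper names, and the sketch is meant to be tried on an ordinary file rather than on /dev/mem. The same two calls would apply to a device file, assuming the kernel grants permission.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first `len` bytes of `path` read/write into our address space.
 * Returns a pointer to the mapping, or NULL on failure.
 * (Hypothetical helper, not part of any X11 code base.) */
unsigned char *map_file(const char *path, size_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_RDWR);   /* the kernel checks file permissions here */
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                     /* the mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Release a mapping obtained from map_file(). Returns 0 on success. */
int unmap_file(unsigned char *p, size_t len)
{
    return munmap(p, len);
}
```

After map_file returns, a plain assignment such as m[0] = 0xAB changes the underlying file (or device) without any further system call, which is the sense in which the kernel is no longer involved.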

Linux

At the bottom is Linux (the kernel): a small part of the system. It provides access to hardware, and implements security.

Gnu

Then GNU (libraries; bash; tools such as ls; the C compiler; etc.). This is most of the operating system.

X11 server (e.g. x.org)

Then X11 (or Wayland, or ...), the base GUI subsystem. This runs in user-land (outside of the kernel): it is just another process, with some privileges. The kernel does not get involved, except to give access to the hardware and to provide inter-process communication, so that other processes can talk with the X11 server.
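As a rough illustration of that inter-process communication, the sketch below talks to a local X server without Xlib. It assumes the server listens on the Unix-domain socket /tmp/.X11-unix/X<display>, which is the usual location on Linux but not guaranteed. The function names are hypothetical, and the request offers no authorization data, so many servers will answer "Failed" or "Authenticate" rather than "Success".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

enum { X11_SETUP_LEN = 12 };

/* Fill `out` with an X11 connection-setup request that offers no
 * authorization data. The leading 'l' tells the server that multi-byte
 * fields are sent least significant byte first. */
void x11_setup_request(uint8_t out[X11_SETUP_LEN])
{
    memset(out, 0, X11_SETUP_LEN);
    out[0] = 'l';   /* byte order: 'l' = LSB first, 'B' = MSB first */
    out[2] = 11;    /* protocol major version 11 (low byte; high byte 0) */
    /* bytes 4-5: minor version 0; bytes 6-9: auth name/data lengths 0 */
}

/* Connect to the local X server for display number `display`.
 * Returns a socket fd, or -1 if nothing is listening there.
 * The socket path is an assumption about the local setup. */
int x11_connect(int display)
{
    assert(display >= 0);

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof addr.sun_path,
             "/tmp/.X11-unix/X%d", display);

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send the setup request and return the first reply byte:
 * 0 = Failed, 1 = Success, 2 = Authenticate; -1 on I/O error. */
int x11_handshake(int fd)
{
    uint8_t req[X11_SETUP_LEN], status;
    x11_setup_request(req);
    if (write(fd, req, sizeof req) != (ssize_t)sizeof req)
        return -1;
    if (read(fd, &status, 1) != 1)
        return -1;
    return status;
}
```

A caller would do something like fd = x11_connect(0) followed by x11_handshake(fd). Everything Xlib and the toolkits do ends up as byte streams of this kind on that socket.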

X11 library

A simple abstraction to allow you to write code for X11.

GUI libraries

Libraries such as Qt, GTK and SDL are next. They make it easier to use X11, and they also work on other systems such as Wayland, Microsoft Windows, or macOS.

Applications

Applications sit on top of the libraries.

Some low-level entry points, for programming

Xlib

Using Xlib is a good way to learn about X11. However, do some reading about X11 first.

SDL

SDL will give you low-level access: pixel buffers that you can draw to directly.

Going lower

If you want to go lower, then I am not sure what good current options are, but here are some ideas.

  • Get an old Amiga, or an emulator, and some good documentation, e.g. https://archive.org/details/Amiga_System_Programmers_Guide_1988_Abacus/mode/2up (I had two books: this one and a similar one).
  • Look at what can be done on a Raspberry Pi. I have not looked into this.

Links

X11

https://en.wikipedia.org/wiki/X_Window_System

Modern ways

Writing this piqued my interest, so I had a look at the modern, fast way to do it. Here is a link:

https://blogs.igalia.com/itoral/2014/07/29/a-brief-introduction-to-the-linux-graphics-stack/


ctrl-alt-delor's answer gives you a good overview of the general architecture. For a more hands-on approach, I give you an answer regarding "nothing but the linux kernel and programming in C".

I like writing to the frame-buffer directly every now and then. The frame-buffer device driver will do all the tedious close-to-the-hardware "how will this eventually end up on a screen" stuff for you. You can do so right away with a root shell:

echo -n -e '\x00\x00\xFF' > /dev/fb0

It sets the very first (top-left) pixel to red on my 32-bit framebuffer:

[Screenshot: the framebuffer with the top-left pixel set to red]

You can totally do the same from within C by opening /dev/fb0 and writing bytes to it. Memory mapping can become your friend. This only works without an X server running, or in a virtual console. Press Ctrl+Alt+F1 to switch to one.
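Here is a minimal sketch of the memory-mapped approach in C, using the ioctls declared in <linux/fb.h>. The function names are hypothetical, the sketch handles 32 bits per pixel only, and it does no console mode switching, so treat it as a starting point rather than a finished program.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fb.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Byte offset of pixel (x, y); line_length is the row stride in bytes. */
size_t fb_offset(uint32_t x, uint32_t y,
                 uint32_t line_length, uint32_t bits_per_pixel)
{
    return (size_t)y * line_length + (size_t)x * (bits_per_pixel / 8);
}

/* Pack 8-bit colour channels using the layout the driver reports. */
uint32_t fb_pack(const struct fb_var_screeninfo *v,
                 uint8_t r, uint8_t g, uint8_t b)
{
    /* This sketch only handles channels of at most 8 bits each. */
    assert(v->red.length <= 8 && v->green.length <= 8 && v->blue.length <= 8);
    return ((uint32_t)(r >> (8 - v->red.length))   << v->red.offset)
         | ((uint32_t)(g >> (8 - v->green.length)) << v->green.offset)
         | ((uint32_t)(b >> (8 - v->blue.length))  << v->blue.offset);
}

/* Set one pixel on the framebuffer device at `path` (e.g. "/dev/fb0").
 * Returns 0 on success, -1 on any failure. Handles 32 bpp only. */
int fb_set_pixel(const char *path, uint32_t x, uint32_t y,
                 uint8_t r, uint8_t g, uint8_t b)
{
    struct fb_var_screeninfo v;
    struct fb_fix_screeninfo f;
    int rc = -1;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    if (ioctl(fd, FBIOGET_VSCREENINFO, &v) == 0 &&
        ioctl(fd, FBIOGET_FSCREENINFO, &f) == 0 &&
        v.bits_per_pixel == 32 && x < v.xres && y < v.yres) {
        /* Map the whole framebuffer so it can be used like an array. */
        uint8_t *mem = mmap(NULL, f.smem_len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
        if (mem != MAP_FAILED) {
            size_t off = fb_offset(x + v.xoffset, y + v.yoffset,
                                   f.line_length, v.bits_per_pixel);
            *(uint32_t *)(mem + off) = fb_pack(&v, r, g, b);
            munmap(mem, f.smem_len);
            rc = 0;
        }
    }
    close(fd);
    return rc;
}
```

Calling fb_set_pixel("/dev/fb0", 0, 0, 255, 0, 0) as root from a virtual console should repeat the red-pixel experiment. With the common layout of blue at bit 0, green at bit 8 and red at bit 16 (an assumption; the driver reports the real one), red packs to 0x00FF0000, which a little-endian machine stores as the bytes 00 00 FF 00, the same leading bytes the echo command writes.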

PS: Visualising random data like your mouse movement can also be fun:

cat /dev/input/mouse0 > /dev/fb0

PPS: Please also note that virtually any real-world desktop application wants more direct access to the hardware for some fancy stuff like hardware acceleration for drawing, 3D and video rendering. The simple frame-buffer device won't do any of this well.


I would strongly recommend starting with ncurses.

Unlike more complex graphical systems, it is based purely on text, so there is no need to get bogged down in the details of screen drivers and graphics libraries. However, the basic principles of putting windows on a screen, moving focus between windows, and so on, still hold true. And you can still do some drawing, at the level of single character blocks and ASCII art.

Of course you're still building this on top of a library, but it's a library which you can easily understand. And more than that, it's a library where the source code is freely available, fairly well documented, and not too impenetrable if you want to read it. You can even modify it yourself if you want to. Or you could look at all the library functions in there to find what the API needs to be, and write it yourself from scratch based on that design.
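To give a feel for the "write it yourself from scratch" route, here is a small sketch in C that draws a window outline with raw ANSI escape sequences instead of ncurses. This is a swapped-in technique, not how ncurses itself is implemented (ncurses looks up the right sequences for your terminal, while this assumes an ANSI/VT100-compatible one). The names term_move and draw_box are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Write the ANSI "cursor position" sequence for (row, col), both 1-based,
 * into buf. Returns the sequence length, as snprintf does. */
int term_move(char *buf, size_t n, int row, int col)
{
    assert(row >= 1 && col >= 1);
    return snprintf(buf, n, "\x1b[%d;%dH", row, col);
}

/* Draw an h-by-w "window" with its top-left corner at (row, col):
 * '+' corners, '-' and '|' edges, and a blank interior. */
void draw_box(FILE *out, int row, int col, int h, int w)
{
    char seq[32];
    assert(h >= 2 && w >= 2);

    for (int y = 0; y < h; y++) {
        /* Jump to the start of this row of the box. */
        term_move(seq, sizeof seq, row + y, col);
        fputs(seq, out);
        for (int x = 0; x < w; x++) {
            int edge_y = (y == 0 || y == h - 1);
            int edge_x = (x == 0 || x == w - 1);
            fputc(edge_y && edge_x ? '+' : edge_y ? '-' : edge_x ? '|' : ' ',
                  out);
        }
    }
    fflush(out);
}
```

Calling draw_box(stdout, 3, 5, 6, 20) from a main function should paint a 6-by-20 box starting at row 3, column 5. Overlapping windows and focus handling would then be a matter of deciding which box to redraw last.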