What are the minimum root filesystem applications that are required to fully boot Linux?

That entirely depends on what services you want to have on your device.

Programs

You can make Linux boot directly into a shell. It isn't very useful in production — who'd just want to have a shell sitting there — but it's useful as an intervention mechanism when you have an interactive bootloader: pass init=/bin/sh to the kernel command line. All Linux systems (and all unix systems) have a Bourne/POSIX-style shell in /bin/sh.

You'll need a set of shell utilities. BusyBox is a very common choice; it contains a shell and common utilities for file and text manipulation (cp, grep, …), networking setup (ping, ifconfig, …), process manipulation (ps, nice, …), and various other system tools (fdisk, mount, syslogd, …). BusyBox is extremely configurable: you can select which tools you want and even individual features at compile time, to get the right size/functionality compromise for your application. Apart from sh, the bare minimum that you can't really do anything without is mount, umount and halt, but it would be unusual not to also have cat, cp, mv, rm, mkdir, rmdir, ps, sync and a few more. BusyBox installs as a single binary called busybox, with a symbolic link for each utility.
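As a rough illustration of the "single binary plus one symbolic link per utility" layout, here is a minimal sketch. The function name link_applets, the staging directory argument and the applet list are assumptions made for the example; BusyBox builds that include the installer feature can also create the links themselves with busybox --install -s.

```shell
#!/bin/sh
# Sketch only: create one symlink per applet inside a staging root.
# link_applets, the staging root and the applet names are illustrative.
link_applets() {
    root=$1; shift
    mkdir -p "$root/bin" || return 1
    for applet in "$@"; do
        # Every name points at the single busybox binary, which checks the
        # name it was invoked under to decide which tool to run.
        ln -sf busybox "$root/bin/$applet" || return 1
    done
}
```

For example, link_applets ./rootfs sh mount umount halt would leave ./rootfs/bin/mount pointing at busybox (the binary itself still has to be copied to ./rootfs/bin/busybox).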

The first process on a normal unix system is called init. Its job is to start other services. BusyBox contains an init system. In addition to the init binary (usually located in /sbin), you'll need its configuration files (usually called /etc/inittab; some modern init replacements do away with that file, but you won't find them on a small embedded system) that indicate what services to start and when. For BusyBox, /etc/inittab is optional; if it's missing, you get a root shell on the console and the script /etc/init.d/rcS (default location) is executed at boot time.
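If you do want an explicit /etc/inittab for BusyBox init, a small one might look like the sketch below, written into a staging directory by a hypothetical helper called write_inittab. The four entries are an assumption about what a minimal device wants (run rcS once, offer a shell on the console, handle reboot and shutdown); adjust them to your services.

```shell
#!/bin/sh
# Sketch only: write a small BusyBox-style inittab into a staging root.
# write_inittab and the chosen entries are illustrative assumptions.
write_inittab() {
    root=$1
    mkdir -p "$root/etc" || return 1
    cat > "$root/etc/inittab" <<'EOF'
# Format: <id>:<runlevels>:<action>:<process>  (BusyBox ignores runlevels)
::sysinit:/etc/init.d/rcS
::askfirst:-/bin/sh
::ctrlaltdel:/sbin/reboot
::shutdown:/bin/umount -a -r
EOF
}
```

An empty id field means "the console"; to run something on a specific terminal instead, put the device name (without /dev/) in the first field.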

That's all you need, beyond of course the programs that make your device do something useful. For example, on my home router running an OpenWrt variant, the only programs are BusyBox, nvram (to read and change settings in NVRAM), and networking utilities.

Unless all your executables are statically linked, you will need the dynamic loader (ld.so, which may be called by different names depending on the choice of libc and on the processor architectures) and all the dynamic libraries (/lib/lib*.so, perhaps some of these in /usr/lib) required by these executables.

Directory structure

The Filesystem Hierarchy Standard describes the common directory structure of Linux systems. It is geared towards desktop and server installations: a lot of it can be omitted on an embedded system. Here is a typical minimum.

  • /bin: executable programs (some may be in /usr/bin instead).
  • /dev: device nodes (see below)
  • /etc: configuration files
  • /lib: shared libraries, including the dynamic loader (unless all executables are statically linked)
  • /proc: mount point for the proc filesystem
  • /sbin: executable programs. The distinction with /bin is that /sbin is for programs that are only useful to the system administrator, but this distinction isn't meaningful on embedded devices. You can make /sbin a symbolic link to /bin.
  • /mnt: handy to have on read-only root filesystems as a scratch mount point during maintenance
  • /sys: mount point for the sysfs filesystem
  • /tmp: location for temporary files (often a tmpfs mount)
  • /usr: contains subdirectories bin, lib and sbin. /usr exists for extra files that are not on the root filesystem. If you don't have that, you can make /usr a symbolic link to the root directory.
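The layout above can be sketched as a few mkdir and ln calls against a staging directory. The function name make_skeleton is made up for the example, and the 1777 mode on /tmp is the usual convention rather than something stated above.

```shell
#!/bin/sh
# Sketch only: build the minimal directory tree in a staging root.
make_skeleton() {
    root=$1
    for d in bin dev etc lib proc mnt sys tmp; do
        mkdir -p "$root/$d" || return 1
    done
    # /sbin as a symbolic link to /bin, /usr as a symbolic link to the root.
    [ -e "$root/sbin" ] || ln -s bin "$root/sbin" || return 1
    [ -e "$root/usr" ] || ln -s . "$root/usr" || return 1
    # World-writable with the sticky bit, as is conventional for /tmp.
    chmod 1777 "$root/tmp"
}
```

With /usr linked to the root, paths such as /usr/bin and /usr/lib resolve to /bin and /lib without taking any extra space.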

Device files

Here are some typical entries in a minimal /dev:

  • console
  • full (writing to it always reports “no space left on device”)
  • log (a socket that programs use to send log entries), if you have a syslogd daemon (such as BusyBox's) reading from it
  • null (acts like a file that's always empty)
  • ptmx and a pts directory, if you want to use pseudo-terminals (i.e. any terminal other than the console) — e.g. if the device is networked and you want to telnet or ssh in
  • random (returns random bytes, risks blocking)
  • tty (always designates the program's terminal)
  • urandom (returns random bytes, never blocks but may be non-random on a freshly-booted device)
  • zero (contains an infinite sequence of null bytes)

Beyond that you'll need entries for your hardware (except network interfaces, these don't get entries in /dev): serial ports, storage, etc.

For embedded devices, you would normally create the device entries directly on the root filesystem. High-end systems have a script called MAKEDEV to create /dev entries, but on an embedded system the script is often not bundled into the image. If some hardware can be hotplugged (e.g. if the device has a USB host port), then /dev should be managed by udev (you may still have a minimal set on the root filesystem).
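Creating the static entries comes down to a handful of mknod calls. The sketch below uses the standard Linux major/minor numbers for the character devices listed earlier; the function name make_dev_nodes and the MKNOD override (handy for a dry run, since real mknod needs root) are assumptions for the example. The log socket and the pts directory are not device nodes and are left out.

```shell
#!/bin/sh
# Sketch only: create the basic character devices under a given directory.
# Set MKNOD="echo mknod" to print the commands instead of running them.
make_dev_nodes() {
    dev=$1
    mknod_cmd=${MKNOD:-mknod}
    # Columns: name, type, major, minor.
    while read -r name type major minor; do
        $mknod_cmd "$dev/$name" "$type" "$major" "$minor" || return 1
    done <<'EOF'
console c 5 1
full c 1 7
null c 1 3
ptmx c 5 2
random c 1 8
tty c 5 0
urandom c 1 9
zero c 1 5
EOF
}
```

Running MKNOD="echo mknod" make_dev_nodes ./rootfs/dev prints one mknod command per device, which is a convenient way to check the table before running it as root.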

Boot-time actions

Beyond the root filesystem, you need to mount a few more for normal operation:

  • procfs on /proc (pretty much indispensable)
  • sysfs on /sys (pretty much indispensable)
  • tmpfs filesystem on /tmp (to allow programs to create temporary files that will be in RAM, rather than on the root filesystem which may be in flash or read-only)
  • tmpfs, devfs or devtmpfs on /dev if dynamic (see udev in “Device files” above)
  • devpts on /dev/pts if you want to use pseudo-terminals (see the remark about pts above)

You can make an /etc/fstab file and call mount -a, or run mount manually.
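A matching /etc/fstab could look roughly like this, again generated into a staging directory by a hypothetical write_fstab helper. The "defaults" options are a placeholder assumption; the entry for /dev is omitted because it depends on whether /dev is static or dynamic.

```shell
#!/bin/sh
# Sketch only: write an fstab covering the pseudo-filesystems listed above.
write_fstab() {
    root=$1
    mkdir -p "$root/etc" || return 1
    cat > "$root/etc/fstab" <<'EOF'
# <device>  <mount point>  <type>  <options>  <dump>  <pass>
proc        /proc          proc    defaults   0       0
sysfs       /sys           sysfs   defaults   0       0
tmpfs       /tmp           tmpfs   defaults   0       0
devpts      /dev/pts       devpts  defaults   0       0
EOF
}
```

The mount points themselves (/proc, /sys, /tmp, /dev/pts) must already exist as directories when mount -a runs.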

Start a syslog daemon (as well as klogd for kernel logs, if the syslogd program doesn't take care of it), if you have any place to write logs to.

After this, the device is ready to start application-specific services.
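Put together, the boot-time actions fit in a very short /etc/init.d/rcS. The sketch below writes one into a staging directory; the helper name write_rcs is made up, and it assumes BusyBox's syslogd and klogd, an /etc/fstab that lists the mounts, and somewhere to write logs to.

```shell
#!/bin/sh
# Sketch only: generate a minimal rcS boot script in a staging root.
write_rcs() {
    root=$1
    mkdir -p "$root/etc/init.d" || return 1
    cat > "$root/etc/init.d/rcS" <<'EOF'
#!/bin/sh
# Mount everything listed in /etc/fstab (proc, sysfs, tmpfs, ...).
mount -a
# Start logging; BusyBox's syslogd and klogd put themselves in the background.
syslogd
klogd
# Application-specific services would be started from here on.
EOF
    chmod +x "$root/etc/init.d/rcS"
}
```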

How to make a root filesystem

This is a long and diverse story, so all I'll do here is give a few pointers.

The root filesystem may be kept in RAM (loaded from a (usually compressed) image in ROM or flash), or on a disk-based filesystem (stored in ROM or flash), or loaded from the network (often over TFTP) if applicable. If the root filesystem is in RAM, make it the initramfs — a RAM filesystem whose content is created at boot time.
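For the initramfs case, the image is a cpio archive in "newc" format, usually compressed. A minimal sketch of packing a staging directory, assuming cpio and gzip are available on the build host (pack_initramfs and the paths are illustrative names):

```shell
#!/bin/sh
# Sketch only: pack a staging root into a gzip-compressed newc cpio archive.
pack_initramfs() {
    root=$1
    out=$2
    # cd in a subshell so the archive contains paths relative to the root.
    ( cd "$root" && find . | cpio -o -H newc ) | gzip -9 > "$out"
}
```

For example, pack_initramfs ./rootfs ./initramfs.cpio.gz; the result can then be handed to the bootloader as the initrd or built into the kernel image. Files in the archive keep the ownership they have on the build host, so you would normally run this as root or under a tool such as fakeroot.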

Many frameworks exist for assembling root images for embedded systems. There are a few pointers in the BusyBox FAQ. Buildroot is a popular one, allowing you to build a whole root image with a setup similar to the Linux kernel and BusyBox. OpenEmbedded is another such framework.

Wikipedia has an (incomplete) list of popular embedded Linux distributions. An example of embedded Linux you may have near you is the OpenWrt family of operating systems for network appliances (popular on tinkerers' home routers). If you want to learn by experience, you can try Linux from Scratch, but it's geared towards desktop systems for hobbyists rather than towards embedded devices.

A note on Linux vs Linux kernel

The only behavior that's baked into the Linux kernel is that it launches one first program at boot time. (I won't get into initrd and initramfs subtleties here.) This program, traditionally called init, has process ID 1 and has certain privileges (immunity to KILL signals) and responsibilities (reaping orphans). You can run a system with a Linux kernel and start whatever you want as the first process, but then what you have is an operating system based on the Linux kernel, and not what is normally called “Linux”. Linux, in the common sense of the term, is a Unix-like operating system whose kernel is the Linux kernel. For example, Android is an operating system which is not Unix-like but based on the Linux kernel.


All you need is one statically linked executable, placed on the filesystem, in isolation. You do not need any other files. That executable is the init process. It can be busybox, which gives you a shell and a host of other utilities, all in one binary. From there you can get to a fully functioning system just by running commands manually in busybox: mount the root filesystem read-write, create /dev nodes, exec the real init, and so on.


If you do not need any shell utilities, a statically linked mksh binary (e.g. against klibc – 130K on Linux/i386) will do. You need a /linuxrc or /init or /sbin/init script that just calls mksh -l -T!/dev/tty1 in a loop:

#!/bin/mksh
while true; do
    /bin/mksh -l -T!/dev/tty1
done

The -T!$tty option is a recent addition to mksh that tells it to spawn a new shell on the given terminal and wait for it. (Before that, there was only -T- to dæmonise a program and -T$tty to spawn on a terminal but not wait for it. This was not so nice.) The -l option simply tells it to run a login shell (which reads /etc/profile, ~/.profile and ~/.mkshrc).

This assumes your terminal is /dev/tty1; substitute your own if it differs. (With more magic, the terminal can be detected automatically. /dev/console will not give you full job control.)
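One possible form of that "magic", offered as a guess rather than as what was originally meant: the kernel lists its console devices in /sys/class/tty/console/active, and the last entry is the one behind /dev/console. The sketch below assumes sysfs is mounted; the function name console_tty and the optional path argument are made up for the example.

```shell
#!/bin/sh
# Sketch only: print the device behind /dev/console, e.g. /dev/ttyS0.
# The optional argument overrides the sysfs file (useful for testing).
console_tty() {
    active=${1:-/sys/class/tty/console/active}
    [ -r "$active" ] || return 1
    last=
    # The file holds a space-separated list; keep the last entry.
    for t in $(cat "$active"); do
        last=$t
    done
    [ -n "$last" ] && printf '/dev/%s\n' "$last"
}
```

The init script could then pass the result to mksh instead of a hard-coded /dev/tty1, falling back to a fixed name when the function fails.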

You need a few files in /dev for this to work:

  • /dev/console
  • /dev/null
  • /dev/tty
  • /dev/tty1

Booting with the kernel option devtmpfs.mount=1 eliminates the need for a populated /dev; just leave it as an empty directory (suitable for use as a mountpoint).

You'll normally want to have some utilities (from klibc, busybox, beastiebox, toybox or toolbox), but they are not really needed.

You may want to add a ~/.mkshrc file, which sets up $PS1 and some basic shell aliases and functions.

I once made a 171K compressed (371K uncompressed) initrd for Linux/m68k using mksh (and its sample mkshrc file) and klibc-utils only. (This was before -T! was added to the shell, though, so it spawned the login shell on /dev/tty2 instead and echo'd a message to the console telling the user to switch terminals.) It works fine.

This is a really bare minimum setup. The other answers provide excellent advice towards somewhat more featured systems. This is a real special-case thing.

Disclaimer: I'm the mksh developer.