Why use cpio for initramfs?

Quoting Documentation/filesystems/ramfs-rootfs-initramfs.txt:

Why cpio rather than tar?

This decision was made back in December, 2001. The discussion started here:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1538.html

And spawned a second thread (specifically on tar vs cpio), starting here:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1587.html

The quick and dirty summary version (which is no substitute for reading the above threads) is:

1) cpio is a standard. It's decades old (from the AT&T days), and already widely used on Linux (inside RPM, Red Hat's device driver disks). Here's a Linux Journal article about it from 1996:

http://www.linuxjournal.com/article/1213

It's not as popular as tar because the traditional cpio command line tools require _truly_hideous_ command line arguments. But that says nothing either way about the archive format, and there are alternative tools, such as:

http://freecode.com/projects/afio

2) The cpio archive format chosen by the kernel is simpler and cleaner (and thus easier to create and parse) than any of the (literally dozens of) various tar archive formats. The complete initramfs archive format is explained in buffer-format.txt, created in usr/gen_init_cpio.c, and extracted in init/initramfs.c. All three together come to less than 26k total of human-readable text.

3) The GNU project standardizing on tar is approximately as relevant as Windows standardizing on zip. Linux is not part of either, and is free to make its own technical decisions.

4) Since this is a kernel internal format, it could easily have been something brand new. The kernel provides its own tools to create and extract this format anyway. Using an existing standard was preferable, but not essential.

5) Al Viro made the decision (quote: "tar is ugly as hell and not going to be supported on the kernel side"):

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1540.html

explained his reasoning:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1550.html
http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1638.html

and, most importantly, designed and implemented the initramfs code.
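
To illustrate point 2 of the quoted summary, here is a minimal Python sketch of a writer for the "newc" cpio layout: a 110-byte ASCII header (the magic 070701 followed by thirteen 8-digit hex fields), then the NUL-terminated file name, then the file data, with name and data each padded to a 4-byte boundary. The helper names (newc_entry, build_archive) and the example file contents are my own and purely illustrative; this is not the kernel's gen_init_cpio, and it leaves uid, gid and mtime at zero.

```python
import stat

NEWC_MAGIC = b"070701"  # "new ASCII" cpio format, no checksum


def _pad4(blob: bytes) -> bytes:
    """Pad with NUL bytes up to the next multiple of 4."""
    return blob + b"\0" * (-len(blob) % 4)


def newc_entry(name, data=b"", mode=stat.S_IFREG | 0o644,
               ino=0, nlink=1, rdev=(0, 0)):
    """Return one archive member: header + name + data, 4-byte aligned."""
    name_bytes = name.encode() + b"\0"  # namesize counts the trailing NUL
    fields = (
        ino, mode,
        0, 0,              # uid, gid (root in this sketch)
        nlink,
        0,                 # mtime
        len(data),         # filesize
        0, 0,              # major/minor of the device holding the file
        rdev[0], rdev[1],  # major/minor for device nodes
        len(name_bytes),   # namesize
        0,                 # checksum, unused with magic 070701
    )
    header = NEWC_MAGIC + "".join("%08X" % f for f in fields).encode("ascii")
    return _pad4(_pad4(header + name_bytes) + data)


def build_archive(entries):
    """Concatenate members and append the end-of-archive marker."""
    return b"".join(entries) + newc_entry("TRAILER!!!", mode=0)


# Hypothetical three-member archive: a directory, a character device
# node (major 5, minor 1 is the console), and a tiny /init script.
archive = build_archive([
    newc_entry("dev", mode=stat.S_IFDIR | 0o755, ino=1, nlink=2),
    newc_entry("dev/console", mode=stat.S_IFCHR | 0o600, ino=2, rdev=(5, 1)),
    newc_entry("init", b"#!/bin/sh\nexec /bin/sh\n",
               mode=stat.S_IFREG | 0o755, ino=3),
])
```

Writing archive to a file should give something that cpio -t can list, though I have only checked the layout against the format description, not booted a kernel with it.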


I'm not 100% sure, but since the kernel itself has to unpack the initial ramdisk during boot, cpio is presumably used because an unpacker for it is already implemented in kernel code.
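
To give a feel for how little work the unpacking side needs, here is a rough Python sketch of a reader for the "newc" cpio layout (110-byte ASCII header, NUL-terminated name, data, each padded to 4 bytes). It is an illustration only, not a port of init/initramfs.c: the function name iter_newc is made up, it assumes an uncompressed archive that starts at offset 0, and it accepts only the 070701 magic.

```python
HEADER_LEN = 110  # 6-byte magic + 13 fields of 8 hex digits


def _align4(offset: int) -> int:
    """Round an offset up to the next multiple of 4."""
    return (offset + 3) & ~3


def iter_newc(buf: bytes):
    """Yield (name, mode, data) for each member until the trailer."""
    pos = 0
    while True:
        header = buf[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN or header[:6] != b"070701":
            raise ValueError("bad or truncated newc header at offset %d" % pos)

        # Thirteen fixed-width hex fields follow the magic.
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        mode, filesize, namesize = fields[1], fields[6], fields[11]

        name_start = pos + HEADER_LEN
        # namesize includes the trailing NUL, which is dropped here.
        name = buf[name_start:name_start + namesize - 1].decode()

        data_start = _align4(name_start + namesize)
        data = buf[data_start:data_start + filesize]
        pos = _align4(data_start + filesize)

        if name == "TRAILER!!!":  # end-of-archive marker
            return
        yield name, mode, data
```

A caller would loop over iter_newc(open(path, "rb").read()) and create each file, directory or device node according to the type bits in mode.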


From what I remember of my old SysV days, cpio could handle device files, but tar could not; this made cpio the 'raw' backup utility of choice before dump came around. It also handled partial filesets and hard links more easily, so incremental backups were easier. I think that GNU tar has caught up with cpio's features, so now it is just a matter of which tool the user is more comfortable with. Both cpio and tar should be installed by default.