Debugging Linux Kernel with QEMU

Depending on the distribution you'd like to use, there are various ways to create a file system image. For example, this article walks you through the laborious route to a "Linux from Scratch" system.

In general, you'd either create a QEMU image using qemu-img, fetch some distribution's installation medium, and boot QEMU from that medium to install onto the image (this page explains the process for Debian GNU/Linux), or use an image prepared by someone else.
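
A minimal sketch of the first route, assuming QEMU is installed. The function name, the file names, the 8G default size and the 1024 MiB of RAM are placeholders chosen for illustration, not values taken from the linked pages:

```shell
# Sketch only: install_to_image and its defaults are hypothetical.
install_to_image() {
    img=$1          # disk image to create, e.g. debian.qcow2
    iso=$2          # installer ISO you downloaded beforehand
    size=${3:-8G}   # virtual disk size

    # Create an empty, growable qcow2 disk image.
    qemu-img create -f qcow2 "$img" "$size" || return 1

    # Boot the installer from the CD-ROM (-boot d), with the image as first disk.
    qemu-system-x86_64 -m 1024 -hda "$img" -cdrom "$iso" -boot d
}
```

Calling install_to_image debian.qcow2 installer.iso would start the installer; once it finishes, the same qemu-system-x86_64 line without -cdrom and -boot d should boot the installed system.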

This section of the QEMU Wikibook contains all the information you need.

Edit: As Gilles' answer to the linked question suggests, you don't need a full-blown root file system for testing; you could just use an initrd image (say, Arch Linux's initrd, like here).


QEMU + GDB step-by-step procedure tested on Ubuntu 16.10 host

To get started from scratch quickly, I've made a minimal, fully automated QEMU + Buildroot example (https://github.com/cirosantilli/linux-kernel-module-cheat). The major steps are covered below.

First get a root filesystem rootfs.cpio.gz. If you need one, consider:

  • a minimal init-only executable image: Custom Linux Distro that runs just one program, nothing else | Unix & Linux Stack Exchange
  • a Busybox interactive system: What is the smallest possible Linux implementation? | Unix & Linux Stack Exchange
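
If you already have a directory tree containing at least an /init executable, packing it into a gzip-compressed cpio archive can be sketched as follows. The function name and the directory layout are assumptions for illustration; newc is the cpio format the kernel's initramfs unpacker reads:

```shell
# Sketch only: pack_rootfs and the directory layout are hypothetical.
pack_rootfs() {
    root=$1                    # directory holding /init and anything else you need
    out=${2:-rootfs.cpio.gz}   # archive to write

    # Archive paths relative to $root so that ./init ends up at /init.
    # The redirection is resolved in the caller's directory, not in $root.
    ( cd "$root" && find . | cpio -o -H newc | gzip -9 ) > "$out"
}
```

For example, pack_rootfs ./rootfs would write rootfs.cpio.gz in the current directory, ready to be passed to QEMU's -initrd option.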

Then, in the Linux kernel source tree:

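# Start from a known release and a clean tree.
git checkout v4.9
make mrproper
make x86_64_defconfig
# Enable debug info and the GDB helper scripts on top of the defconfig.
cat <<EOF >.config-fragment
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_KERNEL=y
CONFIG_GDB_SCRIPTS=y
EOF
./scripts/kconfig/merge_config.sh .config .config-fragment
make -j"$(nproc)"
# -S: freeze the CPU at startup until GDB tells it to continue.
# -s: shorthand for -gdb tcp::1234 (wait for GDB on port 1234).
qemu-system-x86_64 -kernel arch/x86/boot/bzImage \
                   -initrd rootfs.cpio.gz -S -s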
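
An option whose dependencies are unmet will not end up in the final .config, so it may be worth confirming that the three options from the fragment survived the merge. A minimal sketch of such a check, where check_debug_config is a hypothetical helper name and not part of the kernel tree:

```shell
# Sketch only: check_debug_config is a hypothetical helper.
check_debug_config() {
    config=${1:-.config}
    status=0
    for opt in CONFIG_DEBUG_INFO CONFIG_DEBUG_KERNEL CONFIG_GDB_SCRIPTS; do
        # Only an exact "OPTION=y" line counts as enabled.
        if grep -q "^${opt}=y\$" "$config"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            status=1
        fi
    done
    return "$status"
}
```

Run check_debug_config from the top of the kernel tree after merge_config.sh; a non-zero exit status means at least one option is missing and vmlinux may lack the symbols or scripts GDB needs.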

In another terminal, supposing you want to start debugging from start_kernel:

gdb \
    -ex "add-auto-load-safe-path $(pwd)" \
    -ex "file vmlinux" \
    -ex 'set arch i386:x86-64:intel' \
    -ex 'target remote localhost:1234' \
    -ex 'break start_kernel' \
    -ex 'continue' \
    -ex 'disconnect' \
    -ex 'set arch i386:x86-64' \
    -ex 'target remote localhost:1234'

and we are done!!

For kernel modules see: How to debug Linux kernel modules with QEMU? | Stack Overflow

On Ubuntu 14.04 with GDB 7.7.1, hbreak (hardware breakpoints) was needed because software breakpoints set with break were ignored. This is no longer the case on 16.10. See also: https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/901944
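
If software breakpoints appear to be ignored on your setup, the same session with a hardware breakpoint can be sketched as below. It mirrors the GDB invocation above with break swapped for hbreak; the wrapper name debug_with_hbreak is hypothetical, and I have not re-tested this on 14.04:

```shell
# Sketch only: debug_with_hbreak is a hypothetical wrapper name.
debug_with_hbreak() {
    sym=${1:-start_kernel}   # symbol to stop at
    gdb \
        -ex "add-auto-load-safe-path $(pwd)" \
        -ex "file vmlinux" \
        -ex 'set arch i386:x86-64:intel' \
        -ex 'target remote localhost:1234' \
        -ex "hbreak $sym" \
        -ex 'continue' \
        -ex 'disconnect' \
        -ex 'set arch i386:x86-64' \
        -ex 'target remote localhost:1234'
}
```

Run it from the kernel build directory while QEMU is waiting with -S -s; passing a different symbol name as the first argument stops there instead.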

The messy disconnect and what comes after it work around the error:

Remote 'g' packet reply is too long: 000000000000000017d11000008ef4810120008000000000fdfb8b07000000000d352828000000004040010000000000903fe081ffffffff883fe081ffffffff00000000000e0000ffffffffffe0ffffffffffff07ffffffffffffffff9fffff17d11000008ef4810000000000800000fffffffff8ffffffffff0000ffffffff2ddbf481ffffffff4600000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f0300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000

Related threads:

  • https://sourceware.org/bugzilla/show_bug.cgi?id=13984 might be a GDB bug
  • gdb - Remote 'g' packet reply is too long | Stack Overflow
  • http://wiki.osdev.org/QEMU_and_GDB_in_long_mode osdev.org is as usual an awesome source for these problems
  • https://lists.nongnu.org/archive/html/qemu-discuss/2014-10/msg00069.html

See also:

  • https://github.com/torvalds/linux/blob/v4.9/Documentation/dev-tools/gdb-kernel-debugging.rst official Linux kernel "documentation"
  • How to debug the Linux kernel with GDB and QEMU? | Stack Overflow

Known limitations:

  • the Linux kernel does not support building with -O0 (and does not even compile at that level without patches): How to de-optimize the Linux kernel to and compile it with -O0? | Stack Overflow
  • GDB 7.11 will blow your memory on some types of tab completion, even after the max-completions fix: Tab completion interrupt for large binaries | Stack Overflow. This is likely some corner case that the patch did not cover, so running ulimit -Sv 500000 before debugging is wise. It blew up specifically when I tab-completed file<tab> for the filename argument of sys_execve, as in: Can the sys_execve() system call in the Linux kernel receive both absolute or relative paths? | Stack Overflow
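
One way to apply that memory cap without affecting the rest of your shell session is to set it in a subshell that then runs GDB. This is a sketch under assumptions: gdb_capped and the GDB_VMEM_KIB variable are hypothetical names, and the 500000 default is simply the value mentioned above:

```shell
# Sketch only: gdb_capped and GDB_VMEM_KIB are hypothetical names.
gdb_capped() {
    limit_kib=${GDB_VMEM_KIB:-500000}   # soft virtual-memory cap, in KiB
    # The subshell keeps the limit from leaking into the calling shell.
    ( ulimit -Sv "$limit_kib" && gdb "$@" )
}
```

All arguments are passed through to gdb, so gdb_capped -ex 'file vmlinux' behaves like the plain command except that a runaway tab completion should hit the limit instead of exhausting the machine's memory.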