how could I intercept linux sys calls?

Why can't you / don't want to use the LD_PRELOAD trick?

Example code here:

/*
 * File: soft_atimes.c
 * Author: D.J. Capelis
 *
 * Compile:
 * gcc -fPIC -c -o soft_atimes.o soft_atimes.c
 * gcc -shared -o soft_atimes.so soft_atimes.o -ldl
 *
 * Use:
 * LD_PRELOAD="./soft_atimes.so" command
 *
 * Copyright 2007 Regents of the University of California
 */

#define _GNU_SOURCE
#include <dlfcn.h>
#define _FCNTL_H
#include <sys/types.h>
#include <bits/fcntl.h>
#include <stddef.h>

extern int errorno;

int __thread (*_open)(const char * pathname, int flags, ...) = NULL;
int __thread (*_open64)(const char * pathname, int flags, ...) = NULL;

int open(const char * pathname, int flags, mode_t mode)
{
    if (NULL == _open) {
        _open = (int (*)(const char * pathname, int flags, ...)) dlsym(RTLD_NEXT, "open");
    }
    if(flags & O_CREAT)
        return _open(pathname, flags | O_NOATIME, mode);
    else
        return _open(pathname, flags | O_NOATIME, 0);
}

int open64(const char * pathname, int flags, mode_t mode)
{
    if (NULL == _open64) {
        _open64 = (int (*)(const char * pathname, int flags, ...)) dlsym(RTLD_NEXT, "open64");
    }
    if(flags & O_CREAT)
        return _open64(pathname, flags | O_NOATIME, mode);
    else
        return _open64(pathname, flags | O_NOATIME, 0);
}

From what I understand... it is pretty much the LD_PRELOAD trick or a kernel module. There's not a whole lot of middle ground unless you want to run it under an emulator which can trap out to your function or do code re-writing on the actual binary to trap out to your function.

Assuming you can't modify the program and can't (or don't want to) modify the kernel, the LD_PRELOAD approach is the best one, assuming your application is fairly standard and isn't actually one that's maliciously trying to get past your interception. (In which case you will need one of the other techniques.)


Valgrind can be used to intercept any function call. If you need to intercept a system call in your finished product then this will be no use. However, if you are try to intercept during development then it can be very useful. I have frequently used this technique to intercept hashing functions so that I can control the returned hash for testing purposes.

In case you are not aware, Valgrind is mainly used for finding memory leaks and other memory related errors. But the underlying technology is basically an x86 emulator. It emulates your program and intercepts calls to malloc/free etc. The good thing is, you do not need to recompile to use it.

Valgrind has a feature that they term Function Wrapping, which is used to control the interception of functions. See section 3.2 of the Valgrind manual for details. You can setup function wrapping for any function you like. Once the call is intercepted the alternative function that you provide is then invoked.


First lets eliminate some non-answers that other people have given:

  • Use LD_PRELOAD. Yeah you said "Besides LD_PRELOAD..." in the question but apparently that isn't enough for some people. This isn't a good option because it only works if the program uses libc which isn't necessarily the case.
  • Use Systemtap. Yeah you said "Besides ... Linux Kernel Modules" in the question but apparently that isn't enough for some people. This isn't a good option because you have to load a custom kernal module which is a major pain in the arse and also requires root.
  • Valgrind. This does sort of work but it works be simulating the CPU so it's really slow and really complicated. Fine if you're just doing this for one-off debugging. Not really an option if you're doing something production-worthy.
  • Various syscall auditing things. I don't think logging syscalls counts as "intercepting" them. We clearly want to modify the syscall parameters / return values or redirect the program through some other code.

However there are other possibilities not mentioned here yet. Note I'm new to all this stuff and haven't tried any of it yet so I may be wrong about some things.

Rewrite the code

In theory you could use some kind of custom loader that rewrites the syscall instructions to jump to a custom handler instead. But I think that would be an absolute nightmare to implement.

kprobes

kprobes are some kind of kernel instrumentation system. They only have read-only access to anything so you can't use them to intercept syscalls, only log them.

ptrace

ptrace is the API that debuggers like GDB use to do their debugging. There is a PTRACE_SYSCALL option which will pause execution just before/after syscalls. From there you can do pretty much whatever you like in the same way that GDB can. Here's an article about how to modify syscall paramters using ptrace. However it apparently has high overhead.

Seccomp

Seccomp is a system that is design to allow you to filter syscalls. You can't modify the arguments, but you can block them or return custom errors. Seccomp filters are BPF programs. If you're not familiar, they are basically arbitrary programs that users can run in a kernel-space VM. This avoids the user/kernel context switch which makes them faster than ptrace.

While you can't modify arguments directly from your BPF program you can return SECCOMP_RET_TRACE which will trigger a ptraceing parent to break. So it's basically the same as PTRACE_SYSCALL except you get to run a program in kernel space to decide whether you want to actually intercept a syscall based on its arguments. So it should be faster if you only want to intercept some syscalls (e.g. open() with specific paths).

I think this is probably the best option. Here's an article about it from the same author as the one above. Note they use classic BPF instead of eBPF but I guess you can use eBPF too.

Edit: Actually you can only use classic BPF, not eBPF. There's a LWN article about it.

Here are some related questions. The first one is definitely worth reading.

  • Can eBPF modify the return value or parameters of a syscall?
  • Intercept only syscall with PTRACE_SINGLESTEP
  • Is this is a good way to intercept system calls?
  • Minimal overhead way of intercepting system calls without modifying the kernel

There's also a good article about manipulating syscalls via ptrace here.


Some applications can trick strace/ptrace not to run, so the only real option I've had is using systemtap

Systemtap can intercept a bunch of system calls if need be due to its wild card matching. Systemtap is not C, but a separate language. In basic mode, the systemtap should prevent you from doing stupid things, but it also can run in "expert mode" that falls back to allowing a developer to use C if that is required.

It does not require you to patch your kernel (Or at least shouldn't), and once a module has been compiled, you can copy it from a test/development box and insert it (via insmod) on a production system.

I have yet to find a linux application that has found a way to work around/avoid getting caught by systemtap.