How does the libuv implementation of *non-blockingness* work exactly?

I think that trying to understand libuv is getting in your way of understanding how reactors (event loops) are implemented in C, and it is this that you need to understand, as opposed to the exact implementation details behind libuv.

(Note that when I say "in C", what I really mean is "at or near the system call interface, where userland meets the kernel".)

All of the different backends (select, poll, epoll, etc) are, more-or-less, variations on the same theme. They block the current process or thread until there is work to be done, like servicing a timer, reading from a socket, writing to a socket, or handling a socket error.

When the current process is blocked, it literally is not getting any CPU cycles assigned to it by the OS scheduler.

Part of the difficulty in understanding this stuff, IMO, is the poor terminology: "async" and "sync" in JS-land don't really describe what these things are. Really, in C, we're talking about non-blocking vs. blocking I/O.

When we read from a blocking file descriptor, the process (or thread) is blocked -- prevented from running -- until the kernel has something for it to read; when we write to a blocking file descriptor, the process is blocked until the kernel accepts the entire buffer.

In non-blocking I/O, it's exactly the same, except the kernel won't stop the process from running when there is nothing to do: instead, the read or write returns immediately and tells you how much you read or wrote (or that there was an error, such as EAGAIN when nothing could be done yet).
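To make the difference concrete, here is a minimal sketch of a non-blocking read on a POSIX system. The helper names set_nonblocking and try_read, and the return-code convention, are made up for this illustration; only fcntl(), read(), and the errno values are real.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd) {
  int flags = fcntl(fd, F_GETFL, 0);
  if (flags == -1)
    return -1;
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: attempt one read without sleeping in the kernel.
 * Returns the byte count (> 0), 0 on end-of-file,
 * -1 when there is simply nothing to read yet, -2 on a real error. */
long try_read(int fd, char *buf, size_t len) {
  assert(buf != NULL && len > 0);
  ssize_t n = read(fd, buf, len);
  if (n >= 0)
    return (long)n;
  if (errno == EAGAIN || errno == EWOULDBLOCK)
    return -1;  /* a blocking fd would have put us to sleep here */
  return -2;
}
```

Calling try_read on an empty pipe returns -1 straight away instead of parking the process, which is exactly why you then need something like select() to avoid spinning.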

The select system call (and friends) saves the C developer from having to read from a non-blocking file descriptor over and over again -- select() is, in effect, a blocking system call that unblocks when any of the descriptors you are watching is ready, or when the timeout you gave it expires. This lets the developer build a loop around select, servicing any events it reports, like an expired timeout or a file descriptor that can be read. This is the event loop.
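As a rough sketch of that idea, the snippet below sleeps in select() until one descriptor becomes readable or a timeout passes. wait_readable and select_demo are hypothetical names for this example, and a real loop would watch many descriptors at once and retry on EINTR.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: block in select() until fd is readable or
 * timeout_ms elapses. Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms) {
  fd_set readfds;
  struct timeval tv;

  assert(fd >= 0 && fd < FD_SETSIZE);  /* select() cannot watch larger fds */
  FD_ZERO(&readfds);
  FD_SET(fd, &readfds);
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  int rc = select(fd + 1, &readfds, NULL, NULL, &tv);
  if (rc < 0)
    return -1;
  return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Usage sketch: an empty pipe times out; after one byte is written,
 * select() reports the read end as ready. Returns 1 if both held. */
int select_demo(void) {
  int p[2];
  if (pipe(p) != 0)
    return -1;
  int before = wait_readable(p[0], 10);
  if (write(p[1], "x", 1) != 1)
    return -1;
  int after = wait_readable(p[0], 10);
  close(p[0]);
  close(p[1]);
  return before == 0 && after == 1;
}
```

While wait_readable is inside select(), the process is off the run queue and consumes no CPU, which is the whole point.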

So, at its very core, what happens at the C-end of a JS event loop is roughly this algorithm:

while (true) {
  select(open fds, timeout);
  if (did_the_timeout_expire())
    run_js_timers();
  for (each error fd)
    run_js_error_handler(fdJSObjects[fd]);
  for (each read-ready fd)
    emit_data_events(fdJSObjects[fd], read_as_much_as_I_can(fd));
  for (each write-ready fd) {
    if (!pendingData[fd])
      continue;  /* nothing queued for this fd; move on to the next one */
    write_as_much_as_I_can(fd);
    pendingData[fd] = whatever_was_leftover_that_couldnt_write;
  }
}

FWIW - I have actually written an event loop for v8 based around select(): it really is this simple.
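For the curious, here is a stripped-down, read-only version of that algorithm in real C. It is only a sketch under simplifying assumptions: a fixed-size watcher table, no timers, no write or error handling, and plain C callbacks standing in for JS handlers. All names (watch_fd, unwatch_fd, run_loop, count_bytes, loop_demo, MAX_FDS) are invented for this example and have nothing to do with libuv's or Node's actual API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define MAX_FDS 16  /* hypothetical small limit, enough for a sketch */

typedef void (*read_cb)(int fd, void *ctx);

struct watcher { int active; read_cb cb; void *ctx; };
static struct watcher watchers[MAX_FDS];

void watch_fd(int fd, read_cb cb, void *ctx) {
  assert(fd >= 0 && fd < MAX_FDS);
  watchers[fd].active = 1;
  watchers[fd].cb = cb;
  watchers[fd].ctx = ctx;
}

void unwatch_fd(int fd) {
  assert(fd >= 0 && fd < MAX_FDS);
  watchers[fd].active = 0;
}

/* The reactor: sleep in select(), then run the callback of every ready fd.
 * Each callback runs to completion before the next one starts.
 * Returns the number of callbacks run once nothing is watched, -1 on error. */
int run_loop(void) {
  int dispatched = 0;
  for (;;) {
    fd_set readfds;
    int maxfd = -1;
    FD_ZERO(&readfds);
    for (int fd = 0; fd < MAX_FDS; fd++)
      if (watchers[fd].active) { FD_SET(fd, &readfds); maxfd = fd; }
    if (maxfd < 0)
      return dispatched;  /* nothing left to wait for: the loop exits */
    if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0) {
      if (errno == EINTR) continue;  /* interrupted by a signal: retry */
      return -1;
    }
    for (int fd = 0; fd <= maxfd; fd++)
      if (FD_ISSET(fd, &readfds) && watchers[fd].active) {
        watchers[fd].cb(fd, watchers[fd].ctx);
        dispatched++;
      }
  }
}

/* Example callback: count the bytes read, stop watching on EOF or error. */
static void count_bytes(int fd, void *ctx) {
  char buf[64];
  ssize_t n = read(fd, buf, sizeof buf);
  if (n > 0) *(long *)ctx += n;
  else unwatch_fd(fd);
}

/* Usage sketch: feed 5 bytes through a pipe; returns the bytes counted. */
long loop_demo(void) {
  int p[2];
  long total = 0;
  if (pipe(p) != 0 || p[0] >= MAX_FDS) return -1;
  if (write(p[1], "hello", 5) != 5) return -1;
  close(p[1]);  /* the reader will see 5 bytes, then end-of-file */
  watch_fd(p[0], count_bytes, &total);
  if (run_loop() < 0) return -1;
  close(p[0]);
  return total;
}
```

Note that the loop ends by itself once the last watcher is removed, which mirrors how a Node process exits when nothing is left to wait on.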

It's also important to remember that JS always runs to completion. So, when you call a JS function (via the V8 API) from C, your C program doesn't do anything until the JS code returns.

Node.js uses some optimizations, like handing work such as file system operations off to a pool of pthreads, but these all happen in "C space" and you shouldn't think or worry about them when trying to understand this pattern, because they're not relevant.

You might also be fooled into thinking that JS isn't run to completion when dealing with things like async functions -- but it absolutely is, 100% of the time. If you're not up to speed on this, do some reading on the event loop and the microtask queue. Async functions are basically a syntax trick, and their "completion" involves returning a Promise.


I just took a dive into libuv's source code, and found at first that it seems like it does a lot of setup, and not much actual event handling.

Nonetheless, a look into src/unix/kqueue.c reveals some of the inner mechanics of event handling:

int uv__io_check_fd(uv_loop_t* loop, int fd) {
  struct kevent ev;
  int rc;

  rc = 0;
  EV_SET(&ev, fd, EVFILT_READ, EV_ADD, 0, 0, 0);
  if (kevent(loop->backend_fd, &ev, 1, NULL, 0, NULL))
    rc = UV__ERR(errno);

  EV_SET(&ev, fd, EVFILT_READ, EV_DELETE, 0, 0, 0);
  if (rc == 0)
    if (kevent(loop->backend_fd, &ev, 1, NULL, 0, NULL))
      abort();

  return rc;
}

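This function does not wait for events itself: it checks that the descriptor can be watched at all by "setting" a read event with EV_SET (similar to how you use FD_SET before checking with select()), submitting it with kevent(), and then removing it again. The actual waiting is done by uv__io_poll in the same file, which calls kevent() again, this time asking the kernel for the list of events that are ready.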

This is specific to kqueue-style events (mainly used on BSD-likes such as macOS), and there are other implementations for different Unices, but they all use the same function names to do the non-blocking I/O checks. See here for another implementation using epoll.

To answer your questions:

1) Where exactly is the "looping" occurring within libuv?

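The loop itself is the while loop in uv_run() (src/unix/core.c on Unix), which calls the platform's uv__io_poll once per iteration. The QUEUE data structure is used for storing and processing events. This queue is filled by the platform- and I/O-specific event types you register to listen for. Internally, it is a clever linked list built from nothing more than an array of two void * pointers (see here):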

typedef void *QUEUE[2];

I'm not going to get into the details of this list; all you need to know is that it implements a queue-like structure for adding and popping elements.
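If you do want a feel for how two void * pointers can form a list, here is a simplified sketch of the idea: a circular, doubly linked list where slot 0 is "next" and slot 1 is "prev", embedded inside whatever struct you want to queue. libuv implements this with macros; the function names below (q_init, q_insert_tail, and so on), the q_data macro, and struct task are my own stand-ins for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void *QUEUE[2];  /* [0] = next, [1] = prev */

/* An empty queue is a head node that points at itself in both directions. */
void q_init(QUEUE *q) { (*q)[0] = q; (*q)[1] = q; }

int q_empty(QUEUE *q) { return (*q)[0] == (void *)q; }

QUEUE *q_head(QUEUE *q) { return (QUEUE *)(*q)[0]; }

/* Link node in just before the head, i.e. at the tail of the queue. */
void q_insert_tail(QUEUE *head, QUEUE *node) {
  QUEUE *last = (QUEUE *)(*head)[1];
  (*node)[0] = head;
  (*node)[1] = last;
  (*last)[0] = node;
  (*head)[1] = node;
}

/* Unlink node by pointing its neighbours at each other. */
void q_remove(QUEUE *node) {
  QUEUE *prev = (QUEUE *)(*node)[1];
  QUEUE *next = (QUEUE *)(*node)[0];
  (*prev)[0] = next;
  (*next)[1] = prev;
}

/* Recover the struct that embeds a QUEUE field from a pointer to that field. */
#define q_data(ptr, type, field) \
  ((type *)((char *)(ptr) - offsetof(type, field)))

struct task { int id; QUEUE link; };  /* hypothetical payload type */

/* Usage sketch: push tasks 1 and 2, pop them in FIFO order.
 * Returns first_id * 10 + second_id, or -1 if the queue is not empty after. */
int queue_demo(void) {
  QUEUE head;
  struct task a, b;
  a.id = 1;
  b.id = 2;
  q_init(&head);
  q_insert_tail(&head, &a.link);
  q_insert_tail(&head, &b.link);

  QUEUE *first = q_head(&head);
  q_remove(first);
  int first_id = q_data(first, struct task, link)->id;

  QUEUE *second = q_head(&head);
  q_remove(second);
  int second_id = q_data(second, struct task, link)->id;

  return q_empty(&head) ? first_id * 10 + second_id : -1;
}
```

The appeal of this layout is that the list needs no allocation of its own: the links live inside the objects being queued, so the same two-pointer type can chain any kind of struct.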

Once file descriptors in the queue start generating data, the asynchronous I/O code mentioned earlier will pick them up. The backend_fd within the uv_loop_t structure is the kqueue (or epoll) descriptor through which the kernel reports those events, whatever the type of I/O.

2) What are the key steps in each iteration of the loop that make it non-blocking and async?

libuv is essentially a wrapper (with a nice API) around the real workhorses here, namely kqueue, epoll, select, etc. To answer this question completely, you'd need a fair bit of background in kernel-level file descriptor implementation, and I'm not sure if that's what you want based on the question.

The short answer is that the underlying operating systems all have built-in facilities for non-blocking (and therefore async) I/O. How each system works is a little outside the scope of this answer, I think, but I'll leave some reading for the curious:

https://www.quora.com/Network-Programming-How-is-select-implemented?share=1