systemctl start trafficserver wait for start

Solution 1:

This is because systemctl start returns immediately, without waiting for traffic server to be actually started.

Is there a way to tell systemctl start to only return once the service is started?

systemctl start does wait for the service to be ready (unless invoked with --no-block); the service just needs to signal its readiness properly (i.e., not use Type=simple). If the service doesn’t tell systemd when it’s ready, no variation of systemctl is-active, systemctl show, etc. will help you.

The most elegant solution, as mentioned in the comments, would be a socket unit. systemd starts the socket, traffic_line connects to it, systemd starts the service, and traffic_line blocks until the service starts to accept connections on the file descriptor it inherited from systemd.

Alternatively, you can use either Type=forking (the service forks, and the initial process exits once the forked service is ready) or Type=notify (the service calls sd_notify(0, "READY=1") once it’s ready).

Unfortunately, all of these solutions require some support from trafficserver – use systemd’s socket instead of allocating its own, fork and wait appropriately in the main process, or call sd_notify. systemd can’t magically guess when the server is ready if the server doesn’t cooperate :)


After looking at trafficserver’s source code a bit, it looks like it might actually support Type=forking – the server is spawned by a dedicated traffic_cop command, which seems to wait until the server is up and perform some basic testing (at least the code looks like it). So if you change the service type, it might just work:

# /etc/systemd/system/trafficserver.service.d/type-forking.conf
[Service]
Type=forking
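
A drop-in like this only takes effect once systemd has reloaded its unit files. Here is a minimal sketch of applying and checking the override, assuming the drop-in file above is already in place. apply_type_override is a hypothetical helper name used for illustration, not part of systemd or trafficserver:

```shell
#!/bin/sh
# Hypothetical helper: reload unit files, restart the unit, and print
# the service type systemd now uses for it.
apply_type_override() {
    unit="$1"
    # Make systemd re-read unit files and drop-ins
    systemctl daemon-reload || return 1
    # Restart so the new Type= setting is used for this start
    systemctl restart "$unit" || return 1
    # Prints a line such as "Type=forking"
    systemctl show -p Type "$unit"
}
```

Called as apply_type_override trafficserver (as root), it should print Type=forking if the override was picked up.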

Solution 2:

I finally got it to work, after several attempts.

First attempt

After digging into systemctl help I found the is-active command:

$ systemctl is-active trafficserver
active

I therefore wrote a shell script to wait until the service becomes active:

# Poll once per second until systemd reports the unit as active
while true; do
    if [ "$(systemctl is-active trafficserver)" = "active" ]; then
        break
    fi

    sleep 1
done

Unfortunately, even though this script works as expected when I test it with start/stop, I was still getting the same error when running the traffic_line commands right after it. I think that the service is reported as active before the actual processes have fully started (probably a matter of milliseconds).

Second attempt

So I tried another way. Knowing that this is the very first start of the service, I can wait until the PID file of the trafficserver manager exists. Here is what I tried:

while [ ! -f /run/trafficserver/manager.lock ]; do
  sleep 1
done

Same problem: by the time the trafficserver manager's PID file is written, the manager is not yet ready to accept commands, so I'm still getting the error.

Damn, I don't want to use a blind sleep.

Third attempt

So I ended up checking that the traffic_line command itself does not fail:

# Retry once per second until traffic_line can reach the manager
while ! traffic_line --status > /dev/null 2>&1; do
    sleep 1
done
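
One caveat with the loop above is that it spins forever if the service never comes up. Here is a sketch of the same polling idea with an upper bound, usable with any readiness command. wait_for is a hypothetical helper name, and the 30-second limit in the example comment is an arbitrary assumption:

```shell
#!/bin/sh
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs COMMAND once per second until it succeeds.
# Returns 0 on success, 1 if TIMEOUT_SECONDS elapse first.
# Note: POSIX sh has no local variables, so timeout/elapsed are global.
wait_for() {
    timeout="$1"
    shift
    elapsed=0
    until "$@" > /dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example (not executed here):
#   wait_for 30 traffic_line --status || echo "trafficserver not ready" >&2
```

Because the readiness check is passed in as a command, the same helper works for other services, as long as each one has some command that fails until the service is really ready.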

And this works!

Nice, but...

Unfortunately, this answer is very specific to the service I'm using (trafficserver) and does not directly apply to other services.

If you know a more generic answer to this question, please feel free to share it.