Techniques to Monitor cron tasks?

Solution 1:

My common approach is thus:

  • Don't produce any stdout when your cron'ed application completes successfully.
  • Don't pipe any output to /dev/null.
  • Do produce meaningful stderr output when something goes wrong.
  • Do set a MAILTO address in the crontab so cron mails that error output to the required team.
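For illustration, a crontab following those rules might look like this (the address and script path are placeholders):

```crontab
MAILTO=ops-team@example.com

# nightly-backup.sh prints nothing on success and writes errors to
# stderr, so cron only sends mail when something goes wrong.
15 3 * * *  /usr/local/bin/nightly-backup.sh
```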

Solution 2:

In addition to the other answers:

  • have the job write a timestamp to a file when it finishes, along with the exit status of the actual job
  • propagate that exit status back to the original caller

We use the first to make it easier for Nagios (Icinga) to check: e.g. if the last written timestamp is older than n hours (plus whatever logic you need), we know something went wrong.
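A minimal sketch of that wrapper idea (the status-file path and the job itself are placeholders; it is written as a function here so it is easy to reuse):

```shell
run_and_record() {
    # Run the real job, then record "epoch-seconds exit-status" on one line.
    "$@"
    rc=$?
    echo "$(date +%s) $rc" > "$STATUS_FILE"
    return $rc    # propagate the job's exit status back to the caller (cron)
}

STATUS_FILE=/tmp/myjob.status   # example path
run_and_record true             # "true" stands in for the actual job
```

A monitoring check then only has to compare the recorded timestamp against the current time and inspect the recorded exit status.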


Solution 3:

In addition to the above:

  • Do call "logger" in addition to writing to stderr when something goes wrong. Configure syslog to also forward messages to a central host, aka a "loghost". (logger uses the "user.notice" priority by default, but you can change it.)
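A small helper along these lines (the tag and the message are examples; `-p user.err` selects the facility and level explicitly):

```shell
log_error() {
    # Write the message to stderr (so cron's MAILTO mail still fires)
    # and to syslog via logger; the "mycronjob" tag is an example.
    echo "$*" >&2
    logger -t mycronjob -p user.err "$*" 2>/dev/null || true
}

# example call from a cron'ed script:
# log_error "backup failed: rsync exited non-zero"
```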

Solution 4:

There are a couple of techniques you could use for monitoring cronjobs.

To receive alerts of cronjob failures:

  • Use cron's standard MAILTO= variable. Any output a cron job produces (on STDOUT or STDERR) will be mailed to the address you choose.
  • To track and deal with cron mails, you can direct them into a ticket system.

The system you propose, logging information to a "network aware" place, sounds like syslog. syslog provides a simple method for creating logs and normally manages files such as /var/log/messages. You can make basic customisations, such as choosing which files receive the log messages.

Syslog can be started in a network aware mode. For example, you can configure it so a slave can log to a master:

[root@slave ~]#  echo "hello world from slave" | logger -p local1.info

[root@master ~]# tail /var/log/myapp
Jun 29 13:07:01 192.168.1.2 logger: hello world from slave

For a Red Hat based distribution, an example configuration is as follows:

[root@slave ~]# grep local1 /etc/syslog.conf
local1.*                                                @192.168.1.3

[root@master ~]# grep SYSLOGD_OPTIONS /etc/sysconfig/syslog
SYSLOGD_OPTIONS="-m 0 -r"

[root@master ~]# grep local /etc/syslog.conf
local1.* /var/log/myapp

(The first config line, on the slave, forwards local1.* log notices to @192.168.1.3 ("master"). The -r flag in the master's SYSLOGD_OPTIONS turns on network support. Lastly, the third config line directs local1.* messages received on "master" into a file.)

The syslog approach is better for only logging errors/information. Log files have less visibility than e-mail, so you probably won't look at the logs unless something has gone wrong.

If you choose to go the syslog style route, also consider syslog-ng: http://freshmeat.net/projects/syslog-ng/.

Of course, you can get the best of both techniques by using both. For example, syslog'ing both failures and successes, and just mailing for failures.
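That combination can be sketched as a wrapper (the `cronwrap` name and the backup script path are made up; it is a function here so it can live in a shared shell library or a small wrapper script):

```shell
cronwrap() {
    # Run the job; syslog both outcomes, but write to stderr
    # (and thus trigger cron's MAILTO mail) only on failure.
    "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        logger -t cronwrap -p user.info "job '$*' succeeded" 2>/dev/null || true
    else
        logger -t cronwrap -p user.err "job '$*' failed (status $rc)" 2>/dev/null || true
        echo "cronwrap: job '$*' failed (status $rc)" >&2
    fi
    return $rc
}

# called from the cron'ed script, e.g.:
# cronwrap /usr/local/bin/nightly-backup.sh
```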


Solution 5:

I posted a similar answer to a question on Stack Overflow (https://stackoverflow.com/questions/21025495/system-for-monitoring-cron-jobs-and-automated-tasks).

Cronitor (https://cronitor.io) is a tool I built exactly for this purpose. It boils down to a tracking beacon that uses HTTP requests as the pings.

However, one of the needs the OP mentioned in a comment is being informed when a job starts taking too long to run.

I had this same need, and found that similar tools didn't easily support this type of monitoring. Cronitor solves this by allowing you to optionally trigger a begin event and an end event in order to keep track of duration.

Duration tracking was a must-have for me because I had a cron job that was scheduled every hour but over time started taking more than an hour to run. Hope you find it useful!
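The generic begin/end beacon pattern can be sketched with plain curl. The URL below is a placeholder, not a real Cronitor endpoint; substitute the ping URLs from your monitoring service's own documentation:

```shell
ping_wrapped() {
    # $1 = base ping URL (placeholder), remaining args = the job itself.
    base=$1; shift
    curl -fsS "$base/start" >/dev/null 2>&1 || true   # never fail the job on a ping error
    "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        curl -fsS "$base/finish" >/dev/null 2>&1 || true
    else
        curl -fsS "$base/fail" >/dev/null 2>&1 || true
    fi
    return $rc
}

# e.g.: ping_wrapped https://example.invalid/ping/my-job /usr/local/bin/nightly-backup.sh
```

The monitoring side can then alert both on a missing "finish" ping and on too long a gap between "start" and "finish", which covers the duration case above.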