How to write a script that only acts on new log entries

Solution 1:

You can make use of the already available Linux tools like tail, grep, and named pipes. First, create a named pipe (fifo) using:

$ mkfifo /tmp/myfifo

Second, create a simple script that will read from this fifo file. Here is a simple example:

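#!/bin/bash
pipe=/tmp/myfifo

# Open the FIFO once for the whole loop. Reopening it for every read
# leaves short gaps with no reader, during which the writer can be
# killed by SIGPIPE. The loop ends when the writer closes the pipe.
while IFS= read -r line; do
    if [[ "$line" == 'quit' ]]; then
        break
    fi
    # Quote the variable so whitespace and glob characters survive.
    echo "$line"
done <"$pipe"
echo "Reader exiting"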

This script reads from the named pipe and prints each line to stdout until it receives a line consisting of the word "quit". It is only an example; replace the echo with whatever you want to do with each entry.
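To try a reader like this without touching Apache, you can feed the pipe by hand. The following is a minimal sketch under some assumptions: it uses a throwaway FIFO in a temporary directory instead of /tmp/myfifo, and the reader function and the sample entries are made up for the demonstration.

```shell
#!/bin/bash
# Demo with a throwaway FIFO (hypothetical location created by mktemp).
demo_dir=$(mktemp -d)
pipe=$demo_dir/myfifo
mkfifo "$pipe"

# Reader: print every line received until "quit" arrives.
reader() {
    local line
    while IFS= read -r line; do
        if [[ "$line" == 'quit' ]]; then
            break
        fi
        echo "$line"
    done <"$pipe"
    echo "Reader exiting"
}
reader >"$demo_dir/out" &

# Writer: hold the FIFO open on fd 3 while sending several lines,
# the way a long-running tail would.
exec 3>"$pipe"
echo "first entry" >&3
echo "second entry" >&3
echo "quit" >&3
exec 3>&-

# Wait for the background reader, then show what it received.
wait
cat "$demo_dir/out"
```

The writer keeps one file descriptor open for all three lines, so the reader never sees end-of-file between entries.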

Third, use tail to read new lines as they are appended to the Apache log file and redirect the output to the named pipe.

$ tail -n0 -F /var/log/apache2/access.log | grep --line-buffered some_text > /tmp/myfifo

The -F option follows the file by name and keeps retrying if it disappears, so tail picks up the new file after logrotate replaces it. The -n0 option skips the lines already in the file. The grep passes only the relevant lines to the pipe; --line-buffered makes it write each match immediately instead of holding matches in its output buffer.

Using this solution, you don't need any cron job. Just run the script and the tail command shown above.
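If you would rather not manage a separate FIFO, the same idea can be sketched as one script that pipes tail straight into a read loop. This is only a sketch under stated assumptions: handle_entry, process_entries and follow_log are hypothetical names, and a plain substring match done in the shell stands in for grep.

```shell
#!/bin/bash
# Hypothetical handler: replace the body with the action you need.
handle_entry() {
    printf 'new entry: %s\n' "$1"
}

# Read lines from stdin and hand those containing the pattern
# (matched as a literal substring) to the handler.
process_entries() {
    local pattern=$1 line
    while IFS= read -r line; do
        if [[ "$line" == *"$pattern"* ]]; then
            handle_entry "$line"
        fi
    done
}

# Follow the log by name, starting from its current end.
# This call does not return until tail is stopped.
follow_log() {
    local logfile=$1 pattern=$2
    tail -n0 -F "$logfile" | process_entries "$pattern"
}
```

You would start it with something like follow_log /var/log/apache2/access.log some_text. Because the matching happens inside the loop, there is no intermediate grep whose output buffering could delay entries.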

Solution 2:

If you run your script from cron, read the file with logtail or logtail2 so that each run does not reread the whole file. Logtail records the offset it last read up to and resumes from that point the next time you use it.
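The offset-tracking idea can be illustrated with a few lines of shell. This is not logtail's actual implementation, just a simplified sketch: read_new_entries and the state file are hypothetical, and rotation is detected only by the file getting shorter.

```shell
#!/bin/bash
# Print only the bytes appended to a log since the previous call.
# The state file stores the byte offset reached last time.
read_new_entries() {
    local logfile=$1 statefile=$2
    local offset=0 size
    if [[ -f "$statefile" ]]; then
        offset=$(<"$statefile")
        offset=${offset:-0}
    fi
    size=$(wc -c <"$logfile")
    size=$((size))   # normalise any padding in the wc output
    # A file shorter than the stored offset was rotated or truncated,
    # so start again from the beginning.
    if (( size < offset )); then
        offset=0
    fi
    if (( size > offset )); then
        # Emit exactly the bytes between the old offset and the size
        # measured above, so nothing is printed twice on the next run.
        tail -c "+$((offset + 1))" "$logfile" | head -c "$((size - offset))"
    fi
    printf '%s\n' "$size" >"$statefile"
}
```

A cron job could then pipe the output of read_new_entries into the script that acts on each entry.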

If you want to act on new log lines immediately rather than waiting up to 59 seconds between cron invocations, you will have to use tail -f or some equivalent.

Janne's and Khaled's answers both look to solve this problem well.


Solution 3:

If you have syslog-ng as your syslog daemon (rsyslogd will probably do too), you can use that.

Configure it to watch the Apache log file, or alternatively configure Apache to send its logs to syslog by piping them to logger from a CustomLog directive.

The syslog daemon can then match patterns and perform $foo when a match is found. For example, in syslog-ng you can define a file source and a filter like this:

source apache_log { file("/var/log/apache2/access.log"); };
filter apache_match { match("GET /evilscript.php"); };

Then define a destination that makes syslog-ng call an external script:

destination apache_logmatch_script { program("/usr/local/bin/apachematch.pl"); };

Finally, put all of those together:

log { source(apache_log); filter(apache_match); destination(apache_logmatch_script); };

With this technique, syslog-ng starts your script in the background and keeps it running, feeding it new entries as they appear. Because of this, your script has to read its input from STDIN in a loop; here's a short Perl example:

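#!/usr/bin/perl
use strict;
use warnings;

$| = 1;  # disable output buffering so each message appears at once
while (my $line = <STDIN>) {
    chomp $line;  # drop the trailing newline before formatting
    printf "Stuff happened, I got this entry: %s!\n", $line;
}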

I'm not going deeper with my reply until I know you want to try this technique.