Log File Tricks

Dealing with logfiles is always difficult. I've learned some tricks over the years and decided to document them here.

Managing Disk Space

Regardless of how frugal you are with logging, log files will eventually grow and become a burden. There are several approaches to this problem.

  • Copying /dev/null. This solution is easy to carry out on a personal machine, but difficult on a production machine due to restricted access. It also doesn't work well for applications that want exclusive access to their logfiles. Basically, /dev/null is a special operating system device that means "nothing" or "empty". The greater-than symbol copies the contents of /dev/null (which is nothing) into the logfile, so the command replaces the contents of the file with nothing.
cat /dev/null > myApplication.log
  • Scheduled outage. Stop the running application; move, compress, or delete its log; then restart the application. This solution is easy on a personal machine, but difficult on production machines due to limited access to the filesystem.
  • Planned rotation. Set up logrotate, the standard Unix tool for dealing with logs. It reads a config file, /etc/logrotate.conf, and a config directory, /etc/logrotate.d, that stores the rotation schedules of different logs. See the logrotate man page for instructions; a sample stanza appears after this list.
  • Syslog. syslog is a Unix daemon process. It accepts log messages from any application on the network and writes (or discards) them based on its configuration. The advantage of this approach is that it centralizes the handling of log files: syslog's configuration can be changed without affecting the systems that send messages to it, which offers a level of flexibility that is difficult to achieve with separate applications. A sample rule appears after this list.
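
To make the rotation option concrete, here is a minimal logrotate sketch. The filename and retention settings are illustrative assumptions, not values from any particular application; a file like this dropped into /etc/logrotate.d is picked up on logrotate's next scheduled run.

# /etc/logrotate.d/myapplication (hypothetical path and settings)
/var/log/myApplication.log {
    # rotate once a week, keeping four compressed copies
    weekly
    rotate 4
    compress
    # don't complain if the log is missing, and skip it when empty
    missingok
    notifempty
    # copy then truncate, for applications that keep the file open
    copytruncate
}

For the syslog route, classic /etc/syslog.conf rules are one-line selector/action pairs. The central server name below (loghost) is a hypothetical example:

# write mail messages to a local file, and forward everything at
# priority info or above to a central server (hypothetical: loghost)
mail.*          /var/log/mail.log
*.info          @loghost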

Monitoring Multiple Files

Sometimes it's convenient to monitor multiple files at once. This is frequently done to see the full interaction a user has with a system. For example, an HTTP request typically passes through a firewall, then an HTTP server, then an application server, and finally a database. There are several ways to monitor multiple logs.

tail

The simplest way to monitor multiple logs is with tail:

tail -f /var/log/auth.log /var/log/apache2/access.log

The -f parameter tells tail to 'follow' the logs, that is, to monitor them constantly until you tell it to stop. In this case we are monitoring two files, but we could have listed more.

==> /var/log/auth.log <==
Feb 26 15:13:01 frankie CRON[12198]: pam_unix(cron:session): session opened for user root by (uid=0)
Feb 26 15:13:01 frankie CRON[12198]: pam_unix(cron:session): session closed for user root

==> /var/log/apache2/access.log <==
111.111.111.11 - - [26/Feb/2009:15:13:20 -0500] "GET /html HTTP/1.1" 301 397 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"
111.111.111.11 - - [26/Feb/2009:15:13:20 -0500] "GET /html/ HTTP/1.1" 200 1870 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"

When running a command this way, tail (on Linux) groups lines from a single logfile together under a small heading for clarity, as shown above. Sometimes that grouping introduces small delays, up to 15 seconds in my experience, and it can make it hard to follow the flow of information between processes. Neither of these is a problem in simple situations, but in more complex ones it becomes unmanageable.

named pipes

The same kind of monitoring can be achieved, minus the grouping and delays, by using a named pipe.

$ mkfifo logwatcher
$ tail -f /var/log/apache2/access.log > logwatcher &
$ tail -f /var/log/auth.log > logwatcher &
$ cat < logwatcher

The first command creates a named pipe. The second and third commands send the output of a tail command to the named pipe and keep those processes running in the background (courtesy of the & at the end of each line). Note that each tail blocks until something opens the pipe for reading, so nothing flows until the fourth command runs. The fourth command is where we monitor the output. Normally a cat command would terminate, but the tail commands feeding the pipe won't terminate until we kill them, so the cat command stays open and lets us monitor the output. The output below shows the logfiles mixed in mostly chronological order.

127.0.0.1 - - [26/Feb/2009:15:27:38 -0500] "GET /quant.js HTTP/1.0" 404 363 "http://www.linuxjournal.com/video/commandline-101-basic-directory-commands" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"
127.0.0.1 - - [26/Feb/2009:15:27:40 -0500] "GET /pagead/show_ads.js HTTP/1.0" 404 383 "http://www.linuxjournal.com/video/commandline-101-basic-directory-commands" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"
Feb 26 15:28:01 frankie CRON[12531]: pam_unix(cron:session): session opened for user root by (uid=0)
Feb 26 15:28:01 frankie CRON[12531]: pam_unix(cron:session): session closed for user root
127.0.0.1 - - [26/Feb/2009:15:28:08 -0500] "GET /html/ HTTP/1.0" 200 1870 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"
Feb 26 15:29:01 frankie CRON[12440]: pam_unix(cron:session): session opened for user root by (uid=0)
Feb 26 15:29:01 frankie CRON[12440]: pam_unix(cron:session): session closed for user root
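
When you're finished, interrupt cat with Ctrl-C, then clean up the background tails and the pipe. The job numbers below assume the two tails are the only background jobs in this shell:

$ kill %1 %2
$ rm logwatcher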