apache-vsl - VirtualHost-splitting log daemon for Apache
apache-vsl [ -c apache-vsl.conf ] [ -d ] [ -q ] [ -h ]
apache-vsl is a logging program, intended to be run from Apache. It is designed to be configurable, versatile, efficient and scalable.
apache-vsl is designed to serve multiple Apache VirtualHosts using only one logging daemon. This logging daemon is started and managed by Apache itself, requiring little maintenance. It uses strftime-like template strings to specify time-based formats of log filenames, automatically takes care of writing to the proper log file, maintains current and previous symlinks, and can run multiple trigger programs when a log change is performed.
apache-vsl is optimized for Apache installations with high traffic or many VirtualHosts. It keeps file handles loaded in memory between log lines, to efficiently handle high traffic VirtualHosts, but will also close file handles that have not been logged to in a (configurable) amount of time, to efficiently handle a large number of VirtualHosts.
apache-vsl is installed as an Apache-wide CustomLog declaration dependent on an environment variable. For example:
LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" vsl_combined
CustomLog "|/usr/bin/apache-vsl -c /etc/apache-vsl.conf" vsl_combined env=vsl-enabled
The contents of /etc/apache-vsl.conf are described in CONFIGURATION FILE, but a suitable default would be:
<LogGroup _default_>
    LogFile "/var/log/apache/%{vsl:groupname}/access_log.%Y-%m"
    SymbolicLink "/var/log/apache/%{vsl:groupname}/access_log"
</LogGroup>
Then, instead of specifying a CustomLog in the Apache <VirtualHost> block, you would use "SetEnv vsl-enabled yes". For example:
<VirtualHost *:80>
    ServerName www.example.com
    ServerAlias example.com
    DocumentRoot /srv/www/www.example.com/htdocs
    SetEnv vsl-enabled yes
</VirtualHost>
This matches the CustomLog vsl_combined, which is a pipe to apache-vsl. The first field of the LogFormat (%v) is the canonical ServerName value (so, in the above example, it would always be "www.example.com" even if you visited http://example.com/), while the rest is the standard combined log format. Each line is passed to apache-vsl, which interprets the first word as the log group name, and the rest as the actual log entry. The log group name is matched against the <LogGroup> blocks in apache-vsl.conf, looking for the specific entry, or falling back to "_default_" if no match is found. The log entry is then written to the file produced by expanding the LogFile format for that group. In this case, assuming the month to be January 2012, the log would be written to:
/var/log/apache/www.example.com/access_log.2012-01
and a symlink to that file would be created as:
/var/log/apache/www.example.com/access_log
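The path resolution in this example can be sketched in a few lines of Python. This is only an illustration, not apache-vsl's actual code: the function names split_line and expand_logfile are hypothetical, the sample log line is made up, and the sketch assumes the group name is substituted before the strftime conversions are applied.

```python
from datetime import datetime

GROUP_TOKEN = "%{vsl:groupname}"


def split_line(line):
    """Split a piped log line into (group name, rest of the entry)."""
    group, _, rest = line.partition(" ")
    return group, rest


def expand_logfile(template, group, when):
    """Expand a LogFile template for one group at a given time.

    Assumption: the group token is replaced first, then strftime
    fills in the time-based fields.
    """
    return when.strftime(template.replace(GROUP_TOKEN, group))


# Made-up log line in the vsl_combined format shown above.
line = ('www.example.com 192.0.2.1 - - [15/Jan/2012:10:00:00 +0000] '
        '"GET / HTTP/1.1" 200 512 "-" "-"')
group, entry = split_line(line)
path = expand_logfile(
    "/var/log/apache/%{vsl:groupname}/access_log.%Y-%m",
    group,
    datetime(2012, 1, 15),
)
print(path)
```

For the January 2012 timestamp used here, the printed path matches the log file shown above.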
Other features are available, such as PreviousLink (a symlink to the previous logfile) and LogChange (programs to run when the logs change).
Whenever the parsing of LogFile changes (i.e. a month change in this case), SymbolicLink and PreviousLink are updated, and LogChange programs are run.
-c apache-vsl.conf
Location of the configuration file. See CONFIGURATION FILE for the format of the file. If not specified, the system default is /etc/apache-vsl.conf.
-d
Debug mode. Events such as opening or closing log files are logged to STDERR. Because of the configurable timeouts, overall debug log traffic is fairly low, so this mode is recommended when first deploying an apache-vsl installation.
-q
Quiet mode. Events such as startup notification and signals received, normally logged to STDERR, will be suppressed. Errors will still be sent to STDERR during quiet mode.
-h
Displays a help synopsis and exits.
The apache-vsl configuration file is in an Apache-style format. Variable and block names are case sensitive. Currently, no global-level options are available, and all configuration is done inside <LogGroup> blocks. However, like Apache, Include statements may be given to include other files, and may point to individual files, directories, or file globs.
The configuration file may be reloaded either by reloading or restarting Apache itself, which will stop and then start apache-vsl, or by sending SIGUSR1 to the running apache-vsl process to force a configuration file reload.
In the event the configuration file cannot be parsed, apache-vsl will still start, but will not log. An error will be logged to STDERR, which will be sent to the global Apache ErrorLog. This behavior is to prevent a restart thrash within Apache, as Apache will automatically restart any piped logging process that exits.
<LogGroup> blocks require an argument, with the argument being the log group name being passed to apache-vsl, or "_default_" to match any unnamed groups. Specifically named group blocks do not inherit from _default_, so, for example, if Timeout is set to 60 on _default_ but not specified in a specific group's block, it instead assumes the built-in default of 300.
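The lookup and non-inheritance rules can be illustrated with a short Python sketch. It is a simplified model, not the real parser: the dictionary layout and the resolve_group name are hypothetical, and only the Timeout built-in default (300) is taken from this document.

```python
# Built-in defaults; only Timeout's value of 300 is stated in this document.
BUILTIN_DEFAULTS = {"Timeout": 300}


def resolve_group(groups, name):
    """Return the effective settings for a log group name.

    A specifically named block is used on its own; "_default_" is
    consulted only when no block with that name exists. Missing
    options fall back to the built-in defaults, never to _default_.
    """
    block = groups.get(name)
    if block is None:
        block = groups.get("_default_", {})
    settings = dict(BUILTIN_DEFAULTS)
    settings.update(block)
    return settings


groups = {
    "_default_": {
        "LogFile": "/var/log/apache/%{vsl:groupname}/access_log.%Y-%m",
        "Timeout": 60,
    },
    "www.example.com": {
        "LogFile": "/var/log/apache/www.example.com/access_log.%Y-%m",
    },
}

# The named block does not inherit Timeout 60 from _default_.
print(resolve_group(groups, "www.example.com")["Timeout"])
# A group without its own block uses _default_ as a whole.
print(resolve_group(groups, "other.example.org")["Timeout"])
```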
LogFile
The log file to be written to. The value is parsed by strftime, and can use any percent-encoded conversion available there. Additionally, %{vsl:groupname} is recognized and replaced with the group name.
If this option is not specified, no logs will be written.
SymbolicLink
The current symlink, which always points to the file currently named by LogFile. It, like PreviousLink, will replace %{vsl:groupname} with the group name, but strftime variables will not be processed.
Whenever a rollover is detected (either by remembering the previous logfile in-memory, or comparing SymbolicLink to the currently computed LogFile), SymbolicLink and PreviousLink are updated.
If this option is not specified, the ability to detect rollovers will be reduced. apache-vsl will do its best to remember the previous file it had written to, but if SymbolicLink is not being created, and apache-vsl is not running during a rollover (for example, if log files are split along months and apache-vsl is not running between the end of the previous month and the beginning of the current month), apache-vsl will not be able to recognize if a rollover has occurred.
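The two detection paths described above might look roughly like the following Python sketch. The function name detect_rollover and its parameters are hypothetical, and the sketch assumes the symlink stores the same path string that LogFile expands to (for example, both absolute).

```python
import os


def detect_rollover(current_logfile, remembered=None, symlink=None):
    """Return the previous logfile path if a rollover happened, else None.

    remembered: the logfile last written to by this process, if any.
    symlink:    the SymbolicLink path, used when nothing is remembered.
    """
    previous = remembered
    if previous is None and symlink is not None and os.path.islink(symlink):
        # Fall back to whatever the current symlink still points at.
        previous = os.readlink(symlink)
    if previous is not None and previous != current_logfile:
        return previous
    # Either nothing changed, or there is no information to compare with.
    return None
```

With neither a remembered path nor a symlink, the function has nothing to compare against and reports no rollover, which mirrors the limitation described above.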
PreviousLink
A symlink to the previous log, updated whenever a rollover occurs. It, like SymbolicLink, will replace %{vsl:groupname} with the group name, but strftime variables will not be processed.
LogChange
When a rollover occurs, LogChange programs, either scripts or full compiled programs, will be run. This value must be the name of an actual executable; shell interpretation will not be performed. %{vsl:groupname}, if present in the program name, will be replaced with the group name. Several command-line parameters will be passed to the program:
"/path/to/program" "$group_name" "$old_logfile" "$new_logfile"
An example shell script to gzip compress the old logfile would be:
#!/bin/sh
[ -n "$1" -a -e "$2" ] || exit 1
nice gzip -9 "$2"
Multiple LogChange lines may be specified; however, the order in which they are executed is not defined. If you need to perform multiple steps in sequence, it is recommended you create a shell script that executes them in the desired sequence, and use that as a single LogChange.
In fact, very little can be assumed about the execution timing. LogChange programs are forked and then not monitored, so it is likely multiple LogChange programs, if specified, will be running in parallel. The symbolic links may or may not be updated by the time the LogChange programs are running. The new logfile may or may not be created by this time as well.
LogChange programs are run the first time a logging event comes in after the rollover occurs. If the group is not logged to often, this could be seconds/minutes/hours/days after the rollover.
This option is optional.
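To make the invocation rules for LogChange programs concrete, here is a hedged Python sketch of how such a program could be started: an argument list with no shell involved, launched and then left alone. The names build_argv and launch, and the program path in the usage line, are hypothetical; apache-vsl's own implementation may differ.

```python
import subprocess

GROUP_TOKEN = "%{vsl:groupname}"


def build_argv(program, group, old_logfile, new_logfile):
    """Build the argument list for one LogChange program.

    The group name is substituted into the program name, and the three
    documented parameters follow in order.
    """
    return [program.replace(GROUP_TOKEN, group), group, old_logfile, new_logfile]


def launch(argv):
    """Start the program without a shell and without waiting for it."""
    # Passing a list (and not shell=True) means no shell interpretation.
    return subprocess.Popen(argv)


# Hypothetical program path, for illustration only.
argv = build_argv(
    "/usr/local/bin/rotate-%{vsl:groupname}",
    "www.example.com",
    "/var/log/apache/www.example.com/access_log.2011-12",
    "/var/log/apache/www.example.com/access_log.2012-01",
)
print(argv)
```

Because nothing waits on the started process, several such programs can overlap in time, which is why ordering between them cannot be relied upon.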
Timeout
The number of seconds of a log group's inactivity before the log's filehandle is closed. This is desirable on a server with many VirtualHosts, not all of which may be visited regularly, so that not every filehandle is kept open all the time.
Whenever a log line is processed (on any group), apache-vsl will examine its cache of open filehandles and check when a line was last written to each filehandle. If more time than the group's Timeout value has elapsed, the filehandle is closed. If a log file is written to often, within the timeout threshold, it will never be closed.
If this option is not specified, a built-in default of 300 seconds is used. Again, remember log group blocks do not cascade, so if _default_ has a Timeout set and a specific group does not, the built-in default is used, not _default_'s.
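The idle-handle sweep can be modelled with a small Python class. This is a sketch under simplifying assumptions rather than the daemon's real data structure: the HandleCache name and its methods are hypothetical, rollover handling is left out, and the clock is injectable only so the behaviour can be exercised without waiting.

```python
import time


class HandleCache:
    """Keep log files open between lines, closing those idle too long."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._open = {}  # group -> [file object, last write time, timeout]

    def write(self, group, path, line, timeout=300):
        now = self._clock()
        # Every processed line, on any group, triggers a sweep of the cache.
        self._close_idle(now)
        entry = self._open.get(group)
        if entry is None:
            entry = [open(path, "a"), now, timeout]
            self._open[group] = entry
        entry[0].write(line)
        entry[0].flush()
        entry[1] = now
        entry[2] = timeout

    def _close_idle(self, now):
        for group in list(self._open):
            handle, last_write, timeout = self._open[group]
            if now - last_write > timeout:
                handle.close()
                del self._open[group]

    def open_groups(self):
        """Return the names of groups that currently hold an open handle."""
        return sorted(self._open)
```

Note that the sweep only runs when a line arrives, so on a completely idle server handles stay open until the next log event, consistent with the description above.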
apache-vsl's premise relies on there being a single pipe to apache-vsl, to be more efficient at logging. The downside is this CustomLog declaration is in Apache's root level, and there are no CustomLogs defined in the Apache VirtualHost itself. When this happens, Apache will process all root-level CustomLogs. If apache-vsl is the only CustomLog, this is not a problem. However, many distros will set a CustomLog as a fallback for when the user does not set a CustomLog in the VirtualHost. For example, in Debian, this is in /etc/apache2/apache2.conf:
# Define an access log for VirtualHosts that don't define their own logfile
CustomLog /var/log/apache2/other_vhosts_access.log vhost_combined
This will also be triggered, along with apache-vsl. This is not bad per se, but is probably unwanted. Commenting out or removing this line will prevent logs from going to two places.
apache-vsl relies on a one-to-one relationship between log groups and log files. The results of two log groups resolving to the same log file are undefined and likely destructive.
However, there are many cases when you would want multiple Apache VirtualHosts to log to the same file. There are several ways you can accomplish this. You may elect to log to the main site, and use Apache redirects from the other sites to the main site. For example:
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /srv/www/www.example.com/htdocs
    SetEnv vsl-enabled yes
</VirtualHost>

<VirtualHost *:80>
    ServerName example.com
    Redirect permanent / http://www.example.com/
</VirtualHost>
Or you may specify an arbitrary descriptor as the log group name, and use Apache environment variables to set it:
LogFormat "%{vsl-group}e %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" vsl_combined
CustomLog "|/usr/bin/apache-vsl -c /etc/apache-vsl.conf" vsl_combined env=vsl-enabled-custom-group

<VirtualHost *:80>
    ServerName www.example-1.com
    DocumentRoot /srv/www/www.example-1.com/htdocs
    SetEnv vsl-enabled-custom-group yes
    SetEnv vsl-group custom-example
</VirtualHost>

<VirtualHost *:80>
    ServerName www.example-2.com
    DocumentRoot /srv/www/www.example-2.com/htdocs
    SetEnv vsl-enabled-custom-group yes
    SetEnv vsl-group custom-example
</VirtualHost>
Then in /etc/apache-vsl.conf:
<LogGroup _default_>
    LogFile "/var/log/apache/%{vsl:groupname}/access_log.%Y-%m"
    SymbolicLink "/var/log/apache/%{vsl:groupname}/access_log"
</LogGroup>

<LogGroup custom-example>
    LogFile "/var/log/apache/www.example.com/access_log.%Y-%m"
    SymbolicLink "/var/log/apache/www.example.com/access_log"
</LogGroup>
As you can see, while the log group name in apache-vsl is often equal to the canonical VirtualHost name, it is in fact completely arbitrary, and is just used for mapping to log group blocks.
A minor annoyance: the default configuration file is /etc/apache-vsl.conf, to be as generic as possible, and in some cases this interferes with tab completion of Apache's own paths. apache-vsl's author frequently uses Debian-based systems, which keep Apache configuration files in /etc/apache2/, so he instead puts the apache-vsl configuration file in /etc/apache2/vsl.conf, and uses the -c parameter to pass this to apache-vsl.
Bugs: none known, many assumed.
apache-vsl was written by Ryan Finnie <ryan@finnie.org>.