NAME

VTad - A rule-based performance monitoring utility for Linux


REQUIREMENTS

Linux 2.1.x or greater, and Perl 5.003 or greater.


DOWNLOAD

Latest Release is 1.0b2

Download it at http://sf.net/projects/vtad


SYNOPSIS

In the wake of the May 1999 Mindcraft ( http://www.mindcraft.com ) benchmark report (see http://lwn.net/1999/features/MindCraft1.0.phtml ), I found myself wondering just how much effort Mindcraft had put into performance tuning Linux. I thought back to my own Internet World presentations on web server tuning (see http://www.blakeley.com/resources ) and remembered that Linux kernel tuning information had been fairly thin on the ground, and what I had found had sometimes been outdated or overly cryptic.

Interested Linux advocates have done a fine job of bringing out performance information to address the Mindcraft benchmarks (see http://www.nl.linux.org/linuxperf and http://www.tunelinux.com ). But web sites fall out of date quickly, and are subject to bit rot as their maintainers lose interest.

Also, few system administrators will take the time to decipher complex explanations of bus speeds, kernel hacking, etc. I wanted a more active tool - a point-and-shoot performance advisor.

Enter VTad.

Such a tool would periodically check your system for performance problems, and even tell you how to cure them. It would be flexible enough to monitor different parameters for different kinds of servers - Apache, Samba, etc. If a clueless user came onto comp.infosystems.www.server.unix and asked ``How do I speed up my server?'' without any details, you could simply tell him ``Run VTad.''

The folks at Sun have distributed a similar tool for several years now. The SE Performance Toolkit (see http://www.sun.com/sun-on-net/performance ) is now in its third major release, and provides quite powerful rule-based performance analysis. But that is on Solaris, and is not open-source. And SE does not tell you how to fix problems - just what the problems are.

Anyway, I did not want to simply clone the SE Performance Toolkit, with its focus on internal kernel structures and its tendency to break with every new release of Solaris.

I wanted a flexible tool that relied on the /proc filesystem, which can be made backward-compatible between kernel releases at minor cost. I wanted rules that were easier to create and modify than the SE rules. And I wanted to do it in Perl, because I am stubborn that way.

By now some of you are saying ``But isn't Perl slow? Why write a performance tool in an interpreted language?'' On my LinuxPPC laptop, VTad takes less CPU time than WindowMaker 0.60 does - just a blip every 30 seconds. And the gain in flexibility is enormous - I wrote VTad in about three days of part-time coding, most of it spent on prototyping. The same project would have taken me a month in C, and probably the rulesets would not be as flexible.

Even in Perl, implementing a flexible rule-based tool turned out to be a little more complicated than I thought. For an example, look at the cpu line in /proc/stat:

cpu 13405 12347 0 34013

On the cpu line, the numbers are the system ticks spent on user, low-priority, system, and idle time, in that order. To look at the idle time since boot, we can just divide the last number on the cpu line by the sum of the numbers on that line. But we really want to know the average idle time since the last sample we took. So I needed rules that could look back through a history of samples, and rules that could do comparisons across a vector of data.
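As a rough illustration of the two calculations, here is a minimal Perl sketch. It is not taken from the VTad source: the subroutine names are hypothetical, the second sample line is made up, and it assumes the four-field cpu line shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Idle share of all ticks since boot, as a percentage.
# Assumes the last field on the cpu line is idle time.
sub idle_since_boot {
    my ($line) = @_;
    my (undef, @ticks) = split ' ', $line;   # drop the "cpu" label
    my $total = 0;
    $total += $_ for @ticks;
    return $total ? 100 * $ticks[-1] / $total : 0;
}

# Idle share between two samples: subtract the older counters from
# the newer ones, then take the idle share of the difference.
sub idle_between {
    my ($old, $new) = @_;
    my (undef, @prev) = split ' ', $old;
    my (undef, @curr) = split ' ', $new;
    my @delta = map { $curr[$_] - $prev[$_] } 0 .. $#curr;
    my $total = 0;
    $total += $_ for @delta;
    return $total ? 100 * $delta[-1] / $total : 0;
}

my $older = 'cpu 13405 12347 0 34013';   # the line quoted above
my $newer = 'cpu 13505 12347 0 36913';   # made-up later sample

printf "idle since boot:   %.1f%%\n", idle_since_boot($older);
printf "idle since sample: %.1f%%\n", idle_between($older, $newer);
```

The second figure is the one a monitoring rule cares about, which is why a rule needs access to at least one earlier sample.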

I also wanted to check the number of open files against the max number of file descriptors, so I needed a rule that could compare two numbers from the same vector. And sometimes I wanted to compare against constants, or report results as a percentage.

None of these options were difficult to code, but designing a ruleset that accommodates all the options and is easy to use is a little harder. Probably I did not get it right, and there will be revisions. I did design some failsafes in, so that the program quits with semi-intelligent errors when a rule does not seem to be written correctly.


SAMPLE RULESET

The included default ruleset covers some basic subsystem thresholds, and recommends some initial tuning.

By looking at /proc/sys/kernel/shmmax and comparing it to the first entry in /proc/meminfo/Mem (physical memory), we can see the shared memory limit as a percentage of physical memory. VTad recommends allowing at least 10% of physical memory to be shared, and prefers 25%. This helps Apache to cache files, if that feature is enabled.
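The arithmetic behind this recommendation is simple enough to sketch in Perl. The subroutine name and the sample values are hypothetical, and the wording of the three outcomes is mine rather than VTad's; only the 10% and 25% figures come from the text above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Express the shared memory limit as a percentage of physical memory
# and classify it against the 10% minimum and 25% preference.
sub shm_advice {
    my ($shmmax, $mem_total) = @_;
    my $pct = 100 * $shmmax / $mem_total;
    my $level = $pct < 10 ? 'below the 10% minimum'
              : $pct < 25 ? 'acceptable, but below the preferred 25%'
              :             'at or above the preferred 25%';
    return ($pct, $level);
}

# Hypothetical values, both in bytes: a 32 MB limit on a 256 MB machine.
my ($pct, $level) = shm_advice(32 * 1024 * 1024, 256 * 1024 * 1024);
printf "shmmax is %.1f%% of physical memory: %s\n", $pct, $level;
```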

The kernel tuning notes in my source distribution recommend that the kernel allow 3-4 times as many open inodes as open files. So we test /proc/sys/fs/file-max against /proc/sys/fs/inode-max, and warn the user if his settings are too far out of range.

Finally, the local port range can limit server performance in some circumstances. For example, proxy servers have to open connections to origin servers, and they need a unique port:ip tuple for each one. So we recommend that the user have capacity for 10,000-28,000 port numbers, and we check /proc/sys/net/ipv4/ip_local_port_range to find out.
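A minimal sketch of that check, assuming the file holds a low and a high port number separated by whitespace. The subroutine name and both sample ranges are illustrative, and the 10,000 floor is simply the lower end of the recommendation above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the port numbers available in a "low high" range string,
# as read from /proc/sys/net/ipv4/ip_local_port_range.
sub port_capacity {
    my ($text) = @_;
    my ($low, $high) = split ' ', $text;
    return $high - $low + 1;
}

# Two example ranges (not read from a live system).
for my $range ("1024\t4999\n", "32768\t61000\n") {
    my $ports = port_capacity($range);
    my $verdict = $ports >= 10_000 ? 'ok' : 'consider widening the range';
    printf "%5d local ports: %s\n", $ports, $verdict;
}
```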

Turning to the rules, there are simple thresholds for CPU idle time, free memory, and free swap. The free memory rule is a bit weird, since Linux likes to use pretty much all of the available memory for buffers and cache. But I thought a warning would be nice when free memory gets really low.

Next we monitor open inodes and open files to make sure that the system is not bumping up against the kernel limits on either one. And we watch the fork rate, since high fork rates can really hurt performance.

Finally, there are rules for monitoring the disk i/o rate and the IP packet rate. I am not wholly satisfied with these last two rules, and I'll explain why a little later. And after you have read the next section, you can design your own rules instead.


USAGE

Start with --help for usage information.

Usage:

--analyze # analysis mode, the default
--help # prints this message
--interval # sampling interval, in seconds
--noanalyze # display data for every sample
--recalibrate # ignore cached performance data
--ruleset # specify the path to a custom ruleset

By default, the program will read default-rules.pl and report on potential performance problems at 30-second intervals.

This script may only be run by a user who can read /proc (and any other files specified in the ruleset).


CREATING RULESETS

Anyone can create a ruleset and point VTad at it - just copy default-rules.pl to myrules, and start VTad with ./vtad.pl --ruleset myrules

A ruleset consists of two data structures: %rules and %recommendations (the latter is optional).

Recommendations, if present, are evaluated only at startup. Rules are evaluated for every data sample (by default, every 30 seconds).

Each structure is a hash of rules, keyed by the file path (for example: /proc/stat). Multiline /proc entries have their line-label appended to the /proc path, so the Mem line in /proc/meminfo appears as /proc/meminfo/Mem.
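To make the key convention concrete, here is one way such a key could be resolved into a file and a line label. This is a sketch under assumptions, not VTad's actual code: split_key and fields_for_label are hypothetical names, the file-existence test is passed in so the example runs without /proc, and the meminfo text is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a rule key into (file path, line label). If the whole key is
# a file, there is no label; otherwise the last component is the label.
sub split_key {
    my ($key, $is_file) = @_;
    $is_file ||= sub { -f $_[0] };
    return ($key, undef) if $is_file->($key);
    my ($path, $label) = $key =~ m{^(.*)/([^/]+)$};
    return ($path, $label) if defined $path && $is_file->($path);
    return;
}

# Return the fields that follow a given label in multiline text.
sub fields_for_label {
    my ($text, $label) = @_;
    for my $line (split /\n/, $text) {
        my ($first, @rest) = split ' ', $line;
        next unless defined $first;
        $first =~ s/:$//;               # "Mem:" and "Mem" both match
        return @rest if $first eq $label;
    }
    return;
}

my %files  = ('/proc/stat' => 1, '/proc/meminfo' => 1);
my $exists = sub { $files{ $_[0] } };
my $text   = "total: used: free:\nMem: 64946176 62410752 2535424\n";

my ($path, $label) = split_key('/proc/meminfo/Mem', $exists);
my @fields = fields_for_label($text, $label);
print "$path [$label] => @fields\n";
```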

Keys that do not start with / are interpreted as application hierarchy locations. VTad will check its cached application data for a match. If VTad does not find a match in cache, then it will search through the current running processes to try to match the first element, using the $_ variable in the per-process environment, and then look for a match to the rest of the hierarchy.

For example, the key apache/conf/httpd.conf might cause VTad to find /usr/local/apache/bin/httpd among the current processes (or in the cache). Then VTad would look for conf/httpd.conf relative to each part of /usr/local/apache/bin/httpd. If Apache is installed normally, VTad will find /usr/local/apache/conf/httpd.conf, and will store /usr/local/apache for future reference.
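The upward search described here can be sketched as follows. The subroutine name is hypothetical and the existence test is injected so the example runs on any machine; how VTad actually walks the path may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Walk up from a binary's path, looking for a relative file under
# each ancestor directory. Returns the first directory that matches.
sub find_app_root {
    my ($binary, $relative, $exists) = @_;
    $exists ||= sub { -e $_[0] };
    my @parts = grep { length } split m{/}, $binary;
    while (@parts) {
        pop @parts;                               # drop the last component
        my $root = '/' . join('/', @parts);
        my $candidate = $root eq '/' ? "/$relative" : "$root/$relative";
        return $root if $exists->($candidate);
    }
    return;
}

# Pretend only the normal Apache config file exists.
my $exists = sub { $_[0] eq '/usr/local/apache/conf/httpd.conf' };
my $root = find_app_root('/usr/local/apache/bin/httpd',
                         'conf/httpd.conf', $exists);
print "application root: $root\n";
```

The returned directory is the value worth caching, since later keys for the same application can be resolved against it without another process search.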

Right now, VTad stops when it finds the first match for any application key. This may result in false matches (for example, if multiple versions of Apache are running). One workaround is to choose rule keys with some care.

If you need to erase the application cache, run VTad with the --recalibrate option. This should not be necessary for most users, since the VTad distribution does not include any cache data.

Once VTad has found the appropriate file for the rule key, the rule itself can be interpreted. Each rule is another hash, containing directives for threshold comparison. By default, comparison will be done as if all directives were 0, except for window=1, so that the 0th entry in the key data will be compared with its previous value.

compare

takes an argument that is the index of the value to compare against the value at index. For example, if the line is min current max, one might set index=1, compare=2.

compare can also contain a key name instead of a value; the 0th entry of that key will then be compared. Use this trick when you need to compare to a different key, or when you need to compare to the 0th key. DO NOT set compare=0.

Another compare trick is that negative numbers are interpreted as positive constant values. For example, to watch for a red threshold of 1536 kB free memory, you could set up a rule with /proc/meminfo/MemFree, free=1, compare=-1, red=1535, amber=2047. The amber threshold would be 2048 kB.

Finally, compare may be set to a user-definable subroutine. For a trivial example, setting compare to sub { return 1 } would always compare against 1, giving you another way to make constant comparisons.

More complex subroutines can perform arbitrary tasks. The user subroutine is passed the current value, whatever that is, as a list reference. It is up to the user code to process that data, or to ignore it and do something else entirely. The user subroutine is expected to return a single number for comparison with the red and amber thresholds.

window

specifies the span of samples to measure against, with 0 indicating a 1-sample window (average since boot). When window and ratio are used together, VTad measures the index as a percent of the entire data row - this is useful for measurements such as the /proc/stat cpu entry, which shows jiffies of CPU time spent on user, low-priority, system, and idle time. So window=1, ratio=1, index=3 will measure idle time as a percent of the total from sample to sample.

compare and window

are mutually exclusive.

ratio

means that each value should be measured as a percent of the total for this entry.

free

means that this rule measures the available resources, rather than the resources in use.

index

specifies the array index to watch.

amber

specifies the threshold for an amber warning.

red

specifies the threshold for a red alert.

description

specifies a text string to print on red/amber.

remedy

allows the rule to suggest a fix for the problem - for example, raising the file descriptor limit. The remedy is only displayed when the red threshold has been crossed.
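Pulling the directives together, a custom ruleset might look something like the fragment below. Treat it as a guess at the shape, not a copy of default-rules.pl: the /proc/stat/cpu key, the cpu and load thresholds, the /proc/loadavg rule, the description and remedy strings, and the bare %rules assignment are all assumptions. Only the MemFree numbers and the window=1, ratio=1, index=3 combination come from the text above. Check the default ruleset in the distribution for the real conventions.

```perl
# myrules - a hypothetical VTad ruleset fragment

%rules = (
    # Idle time as a percent of all ticks, sample to sample.
    '/proc/stat/cpu' => {
        window      => 1,
        ratio       => 1,
        index       => 3,
        free        => 1,
        amber       => 20,      # illustrative threshold
        red         => 10,      # illustrative threshold
        description => 'CPU idle time is low',
        remedy      => 'Look for runaway processes, or add CPU capacity.',
    },

    # The free memory example from the compare section: a negative
    # compare value is read as a positive constant.
    '/proc/meminfo/MemFree' => {
        free        => 1,
        compare     => -1,
        amber       => 2047,
        red         => 1535,
        description => 'Free memory is very low',
    },

    # A user-defined subroutine; it receives the current values as a
    # list reference and returns a single number.
    '/proc/loadavg' => {
        compare     => sub { my ($values) = @_; return $values->[0] },
        amber       => 4,       # illustrative threshold
        red         => 8,       # illustrative threshold
        description => 'Load average is high',
    },
);

1;  # harmless if the file is loaded with do, needed with require
```

Because each key can appear only once in a Perl hash, a fragment like this allows a single rule per /proc entry.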


AUTHORS

Michael Blakeley <mike@blakeley.com>

is an independent performance analyst. He can be reached via http://www.blakeley.com


VERSIONS

Version 1.0b2, 19990620 by Michael Blakeley. Added dynamic history depth. Added application finding and location caching. Added user-definable code references for compare. Improved data parsing in sample(). Changed default ruleset from vtad_default.rules to default-rules.pl, so that emacs will default to Perl mode.

Version 1.0b1, 19990606 by Michael Blakeley. Initial release.


BUGS AND LIMITATIONS

It is not possible to have multiple occurrences of the same key in %rules. As a result, you can only make one comparison per /proc entry. This should not be a problem for most users, but eventually VTad should learn how to handle list references within rules.


FUTURE ENHANCEMENTS

It would also be nice to have a single canonical source for rulesets of various kinds, and even allow VTad to auto-update itself and its example scripts.

It would be even nicer to have an option to self-tune - follow the recommendations at startup, and implement remedies for any red conditions that pop up. This would have to be done with user-defined remedy subroutines - perhaps making remedy a list of strings and code references?

The app location finder needs to check the filesystem as well as /proc. Perhaps rpm -q as well? Or a Perl rpm module?

The disk rule is pretty shoddy - it simply assumes that io above a given number of operations per second is too much. Unfortunately Linux has little to offer in this area.

This release includes a kernel patch to add disk-utilization statistics to Linux 2.2.x and display them in /proc/partitions. I found the original patch in the development release of sard for Linux, at ftp://ftp.uk.linux.org/pub/linux/sct/fs/profiling/ - it seems to have been written by sct, whoever that is. The sard package did not include a readme, so I cannot credit him very well :-). Note that I did have to fix some filenames and one apparent typo in the patch to make it work on my system.

With time perhaps this patch will make it into the mainstream - meanwhile you will need to apply it if you want future disk utilization rules to function correctly.

The new fields in /proc/partitions are after each partition name. They appear to be counts of read io operations, merged reads, sectors read, ticks spent queueing and servicing reads, write io operations, merged writes, sectors written, ticks spent queueing and servicing writes, pending io operations, total ticks spent in the io queue, and the total io ticks (queue and service).

We are interested in two metrics: average service time and average utilization. We can calculate utilization as the ratio between total io ticks in the io queue and total uptime (the sum of the four numbers on the cpu line of /proc/stat). Average service time is more complex, but we can get to it by totaling the reads, merged reads, writes, and merged writes, and then dividing the total io ticks (q+s) by that sum. These are both complex rules, and neither fits into the current rule plan. This will be fixed in a future release.
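Until such rules exist, the two calculations can at least be sketched outside the rule framework. The Perl below follows the field order described above, which is itself a reading of the patch rather than documented behavior. The subroutine name and the sample numbers are made up, and the figures are averages since boot, not per-sample rates.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Given the eleven patched /proc/partitions fields for one partition
# and the total uptime in ticks, return (utilization %, ticks per op).
sub disk_metrics {
    my ($fields, $uptime_ticks) = @_;
    my ($rio, $rmerge, undef, undef,       # reads, merged reads
        $wio, $wmerge, undef, undef,       # writes, merged writes
        undef, $queue_ticks, $io_ticks) = @$fields;
    my $ops = $rio + $rmerge + $wio + $wmerge;
    my $utilization  = $uptime_ticks ? 100 * $queue_ticks / $uptime_ticks : 0;
    my $service_time = $ops ? $io_ticks / $ops : 0;
    return ($utilization, $service_time);
}

# Made-up counters for one partition, and an uptime taken as the sum
# of a four-number cpu line (13405 + 12347 + 0 + 34013).
my @fields = (400, 100, 8000, 1200, 300, 200, 6000, 1800, 0, 2390, 3000);
my ($util, $svc) = disk_metrics(\@fields, 59765);
printf "utilization %.1f%%, average service time %.1f ticks\n", $util, $svc;
```

A sample-to-sample version would subtract the previous counters first, in the same way as the cpu idle calculation.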

Also check /proc/cmdline to find root device name?

VTad needs a better way to measure network rates. Right now the default ruleset has pretty good thresholds for 10BaseT, but ideally this rule should know how much bandwidth you have, and warn you when you are filling up that pipe. It would also be useful to distinguish between packet rates and data rates.

How about a test to see if the binary is newer than the source config file? Example: Apache and its compile-time Configuration file.

Is there any way to check the running kernel for compile-time options?

How about checking file sizes? ``RED: Apache error log growing too fast''

Check for free disk space? Ignore cdroms, floppies.

Implement vmstat-like output format for --display.

Better documentation.


REFERENCES

proc(5)

Linux kernel source, especially Documentation/sysctl, Documentation/networking, and net/TUNABLE.

http://home.att.net/~jageorge/performance.html

http://www.tunelinux.com

http://www.kegel.com/mindcraft_redux.html

http://www.linux.com/tuneup/

http://us1.samba.org/samba/ftp/docs/textdocs/Speed.txt


Why VTad?

There is a Solaris performance tool called SE, aka RuleTool, but that was a boring name. SE includes several examples, one of which is virtual adrian, named after Adrian Cockcroft.

STR.


LICENSE

This program is distributed under the Perl Artistic License. You can find a copy of it in the tarball, or in your Perl distribution.

Since this release is more of a framework than a finished product, I look forward to your feedback.