Logging

Overview

Collectl supports 2 very basic data logging mechanisms. In the first case it logs the data as read from /proc to a file with the extension raw or raw.gz, depending on whether or not the Perl module Compress::Zlib has been installed. If it has not, one can always install it at a later time and collectl will happily use it the next time it is started. One useful property of raw files is that one can play them back using different switches/options, either for display or to generate plottable files from them.

The second major form of logging is writing data to one or more tabularized files, also known as plottable files. The data associated with the core subsystems goes to a file with the extension tab, while the detail data associated with devices like cpus, disks, networks, etc. goes to one of several other files.

The biggest benefit of raw files is that they are very lightweight to create, since no additional processing is performed on the data. Since they contain the unaltered /proc data from which collectl derives the numbers it reports, it is always possible to go back and look at the original data. In some cases, there is data in the raw file that was easier to collect than ignore, and in these situations one can actually see more data than is normally available.

As their type implies, plottable files have their data in a form that is ready to be plotted with tools like gnuplot or immediately loaded into a spreadsheet like OpenOffice or Excel or any other tool that can read space-separated data. When collectl generates this data while it is running, the data can be read as it is being written, making real-time monitoring/display possible. For situations where a tool requires that data be delimited by something other than spaces, one can change the data separator with --sep. In fact, for the case where a tool such as rrd requires the date to be in UTC format, you can even change the timestamp format using --utc.
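As a rough illustration of consuming such delimited data from a script, the following Python sketch parses a few lines into dictionaries. The sample text, the column names, and the assumption that header lines begin with '#' are illustrative only and are not taken from an actual collectl file; the sep parameter stands in for whatever separator was chosen with --sep.

```python
def parse_plot_text(text, sep=" "):
    """Parse delimited plot data into a list of dicts keyed by column name.

    Assumption: the most recent line starting with '#' names the columns.
    """
    columns = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Treat the latest '#' line as the column header.
            columns = line.lstrip("#").split(sep)
            continue
        if columns is None:
            raise ValueError("data line seen before any header line")
        rows.append(dict(zip(columns, line.split(sep))))
    return rows


# Hypothetical sample; the column names are made up for this sketch.
SAMPLE = """#Date Time cpu_user cpu_sys
20240101 00:00:10 12 3
20240101 00:00:20 15 4
"""

rows = parse_plot_text(SAMPLE)
print(rows[0]["cpu_user"])  # prints 12
```

Values are left as strings here; a real consumer would convert the numeric columns as needed.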

Logging Options

The following table provides a high-level view of the output options, the 2 main destinations being the terminal and a file. The first line of each of the 4 cells lists the switches and the second lists the format of the data:

            Raw            Plot
Terminal    no switches    -P
            terminal       plot
File        -f             -P -f
            raw            plot

For most users, this matrix is all you need to know. On the other hand, if you want to use collectl to feed data to other tools, or perhaps log to both raw and plot files at the same time, read on...

Logging both raw and plottable data at the same time!

The main benefit in requesting collectl to write its data in plottable form is that data becomes available for immediate plotting without any post-processing required, the one expense being some additional processing overhead. However there are a few potential limits in doing so that should be understood.

First and foremost, once a plottable file has been created the original data from which it was created is lost forever. In most cases that is fine as there is really no need to go back to the original source. However, very often one collects summary data because that is what they are interested in, but then later decides they want to look at the details. This can be easily done by just replaying the raw file and requesting details be displayed or written to a plottable file. If a raw file had not been generated, this option would not be possible.

A second limitation with plottable data files is that one cannot easily examine the data by timeframes and when there are multiple data files involved, it is not easy to look at all the data together as time-oriented samples without plotting it. It is always possible to write a script that merges this data together, but that functionality is natively built into collectl when used in playback mode.

Finally, there are times when one might wish to go back and look at non-normalized data. For example, if 3 processes are created over a 10 second period, collectl will report a rate of 0 process creations/second because the rate is rounded down. The only way to see what really happened is to play the data back with -on, which tells collectl not to normalize the data, so it reports the value of the counter rather than its rate.
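The arithmetic behind that example can be sketched in a few lines of Python. This only illustrates the rounding described above and is not collectl's own code; the function name is made up for the sketch.

```python
def normalized_rate(counter_change, interval_seconds):
    """Per-second rate, rounded down to a whole number as described in the text."""
    return counter_change // interval_seconds


created = 3     # processes created during the interval
interval = 10   # seconds between samples

print(normalized_rate(created, interval))  # prints 0, the activity disappears
print(created)                             # prints 3, the non-normalized value
```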

In most cases none of these restrictions should be a concern, but there may be occasions in which they are and that is where the --rawtoo switch comes in. When specified in conjunction with -P, collectl will generate raw data in addition to the plottable data, making it possible to go back to the source if/when necessary. The only real overhead is the amount of disk space required since the raw data is already sitting in a buffer and ready to be written. If the plottable files are being generated in uncompressed format, the size of the compressed raw file becomes even less significant.

We can now extend our table to include logging while communicating over a socket, as well as logging to both raw and plot files at the same time. Note that if you don't include -f, no local storage is involved.

            Raw            Plot
Terminal    no switches    -P
            terminal       plot
File        -f             -P -f [--rawtoo]
            raw            plot
Socket      -A [-f]        -A -P [-f [--rawtoo]]
            terminal       plot

S-expressions, the 3rd type of file

We finally come to a third type of output, intended primarily for feeding collectl data to other local programs, and that is the s-expression. S-expressions have been around for many years, with their earliest roots in programming languages such as Lisp and Scheme (as described on Wikipedia), and offer a semi-structured mechanism for representing data. One environment in which they are heavily used is supermon. By providing a mechanism for collectl to write s-expressions, one can more easily supply data to supermon or any other tool that might wish to consume it in close to real-time. The actual contents of the s-expressions are driven by the subsystems for which data is being collected.

There are actually 2 types of values collectl can write into the s-expressions. The first is simply the raw data values as read from /proc, which one can request by specifying --sexpr raw. With this form, the consumer of the data must compute the differences between samples and, if a rate is desired, divide by the number of seconds between samples. On the other hand, if one simply wishes to look at the current rates for the various counters, the second form can be requested by specifying --sexpr rate.
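To make the raw case concrete, here is a minimal Python sketch of the calculation a consumer would have to perform on two consecutive raw samples. The counter names and values are hypothetical, and the samples are assumed to have already been parsed into dictionaries; parsing the s-expression itself is not shown.

```python
def rates_from_raw(prev, curr, seconds):
    """Turn two raw counter samples into per-second rates.

    prev and curr map counter names to cumulative values; seconds is the
    time between the two samples.
    """
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return {
        name: (curr[name] - prev[name]) / seconds
        for name in curr
        if name in prev
    }


# Hypothetical counters taken 10 seconds apart.
prev_sample = {"ctxt": 1000, "intr": 500}
curr_sample = {"ctxt": 1600, "intr": 800}

print(rates_from_raw(prev_sample, curr_sample, 10))
# prints {'ctxt': 60.0, 'intr': 30.0}
```

With --sexpr rate this work is already done, so the consumer can use the values as they arrive.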

When used without any logging switches (-f, -P and/or --rawtoo), the resultant s-expression is written to the terminal, more for consistency with the general output model than anything else.

The more typical use of S-expressions requires that collectl be in logging mode, that is one has specified a destination with -f. The directory associated with this destination then becomes the default location for the s-expression file. If one wishes to change that directory one can include the new destination with --sexpr. See the man page for details. Alternatively, one can send the s-expression over a socket by using -A instead of -f.

For consistency with the general logging model, one can include additional logging switches, for example to send an s-expression over a socket and log to a local raw or plot file at the same time. In fact, one can even log to a local raw or plot file while also writing the s-expression to a local file. This allows us to extend our summary table to its complete form:

            Raw            Plot                     S-Expression
Terminal    no switches    -P                       --sexpr
            terminal       plot                     sexpr
File        -f             -P -f [--rawtoo]         --sexpr -f [-P][--rawtoo]
            raw            plot                     sexpr
Socket      -A [-f]        -A -P [-f [--rawtoo]]    --sexpr -A [-f [-P][--rawtoo]]
            terminal*      plot                     sexpr

* remember, logging does not apply to terminal based output
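For anyone scripting collectl invocations, the matrix can be restated as a small lookup. The Python sketch below lists only the required switch names from each cell, leaving out the bracketed optional ones and any arguments the switches take (such as the destination for -f or the raw/rate choice for --sexpr). The dictionary and function names are made up for this illustration.

```python
# Required switches per (destination, column), transcribed from the table.
REQUIRED_SWITCHES = {
    ("terminal", "raw"): [],
    ("terminal", "plot"): ["-P"],
    ("terminal", "sexpr"): ["--sexpr"],
    ("file", "raw"): ["-f"],
    ("file", "plot"): ["-P", "-f"],
    ("file", "sexpr"): ["--sexpr", "-f"],
    ("socket", "raw"): ["-A"],
    ("socket", "plot"): ["-A", "-P"],
    ("socket", "sexpr"): ["--sexpr", "-A"],
}


def required_switches(destination, column):
    """Return the switch names the table lists for one cell."""
    try:
        return REQUIRED_SWITCHES[(destination, column)]
    except KeyError:
        raise ValueError(f"no table cell for {destination!r}/{column!r}") from None


print(required_switches("file", "plot"))  # prints ['-P', '-f']
```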

There are a couple of additional caveats you should be aware of. When you specify -A, --sexpr, -P and -f at the same time, the s-expression goes over the socket and the plot data gets logged locally. Furthermore, this will also cause the s-expression data to be logged locally. In other words, when you specify -f with socket I/O, the type of data written over the socket is always logged locally as well.

If you are still confused, try experimenting with various combinations of switches and see which files get generated.

One should also note that when run on an HP XC Cluster, the actual syntax of the s-expression generated has been extended to make it more easily consumable in that environment.

The overhead

So what is the overhead associated with all this logging? From the perspective of CPU load it can be quite minimal, since in most cases the data is already in hand and all that needs to be done is to write it out to one or more additional files, something that is a fairly low-overhead operation on Linux systems. If this is really a concern, measure it yourself. If you want to see how much disk space is involved, just examine the sizes of the file(s) created during the performance tests and see for yourself.
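One simple way to examine those sizes is to total the bytes in a log directory by file type. The Python sketch below groups files by the endings mentioned in this document (raw.gz, raw and tab) and lumps everything else together as "other". The function name is made up for this sketch, and it is pointed at the current directory only as a placeholder.

```python
from pathlib import Path

# Endings mentioned in this document; longer endings are checked first.
KNOWN_ENDINGS = (".raw.gz", ".raw", ".tab")


def sizes_by_ending(log_dir):
    """Total file sizes in bytes in log_dir, grouped by file name ending."""
    totals = {}
    for path in Path(log_dir).iterdir():
        if not path.is_file():
            continue
        ending = next(
            (e for e in KNOWN_ENDINGS if path.name.endswith(e)), "other"
        )
        totals[ending] = totals.get(ending, 0) + path.stat().st_size
    return totals


# Placeholder: replace "." with the directory that was given to -f.
print(sizes_by_ending("."))
```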