Tracing is a technique used to understand what is going on in a system in order to debug or monitor it. A tracer is the software used for tracing. Tracing can help diagnose a wide range of problems that are otherwise extremely challenging to investigate, such as performance problems in complex parallel or real-time systems.
Tracing is similar to logging: it consists of recording events that happen in a system. However, compared to logging, it usually records much lower-level events that occur much more frequently. Tracers must therefore be optimized to handle a lot of data while having a small impact on the system. Tracers typically generate thousands of events per second. The resulting traces frequently contain millions of events and range in size from many megabytes to tens of gigabytes.
Traces may include events from the operating system kernel (IRQ handler entry/exit, system call entry/exit, scheduling activity, network activity, etc.). They may also include events from any application.
The list of events in a trace may be read manually, like a log file, for the maximum level of detail. However, trace analyzers and viewers are available to produce graphs and statistics from this enormous amount of data. These programs must be specially designed to process the volume of data that traces contain quickly.
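To make the idea of producing statistics from a list of events concrete, here is a minimal sketch in Python. It is not how any LTTng analyzer is implemented: the `TraceEvent` record, its fields, and the event names in the sample data are hypothetical and chosen only for illustration. It assumes the events are already sorted by timestamp.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class TraceEvent(NamedTuple):
    """Hypothetical, simplified event record (not an LTTng data structure)."""
    timestamp_ns: int  # time of the event, in nanoseconds
    source: str        # "kernel" or an application name
    name: str          # illustrative event name


def summarize(events: Iterable[TraceEvent]) -> Tuple[Counter, float, float]:
    """Return (count per event name, duration in seconds, events per second).

    Assumes the events are already ordered by timestamp.
    """
    ordered = list(events)
    counts = Counter(event.name for event in ordered)
    if len(ordered) < 2:
        return counts, 0.0, 0.0
    duration_s = (ordered[-1].timestamp_ns - ordered[0].timestamp_ns) / 1e9
    rate = len(ordered) / duration_s if duration_s > 0 else 0.0
    return counts, duration_s, rate


# Made-up sample data mixing kernel and application events.
SAMPLE_EVENTS = [
    TraceEvent(1_000_000, "kernel", "syscall_entry"),
    TraceEvent(1_200_000, "kernel", "syscall_exit"),
    TraceEvent(1_500_000, "kernel", "irq_entry"),
    TraceEvent(1_600_000, "kernel", "irq_exit"),
    TraceEvent(2_000_000, "app", "request_start"),
    TraceEvent(3_000_000, "kernel", "syscall_entry"),
]

counts, duration_s, rate = summarize(SAMPLE_EVENTS)
for name, count in counts.most_common():
    print(f"{name}: {count}")
print(f"duration: {duration_s:.6f} s, rate: {rate:.0f} events/s")
```

A real analyzer would stream events from disk rather than hold them all in memory, since a trace of millions of events may not fit comfortably in a Python list.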
In the case of LTTng, low tracing overhead is achieved by instrumenting the Linux kernel with a set of custom patches. The same set of patches can be used for both Linux kernel and user space (i.e., application) tracing.
Refer to the LTTng Project for more information on tracing and LTTng.
In the scope of the Linux Tools LTTng project, a trace is essentially a (very) large set of time-ordered LTTng events. The LTTng set of plugins accepts these traces and provides a number of standard views to analyze their contents either individually or through an experiment.
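As an illustration of what "time-ordered" means when several traces are analyzed together, the sketch below merges two individually sorted event lists into one stream ordered by timestamp. It rests on an assumption not stated above: that an experiment can be thought of as a group of traces viewed as a single time-ordered sequence. The `TraceEvent` record and the sample data are hypothetical, and this is not the plugins' implementation.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class TraceEvent(NamedTuple):
    """Hypothetical, simplified event record (not an LTTng data structure)."""
    timestamp_ns: int  # time of the event, in nanoseconds
    trace: str         # label of the trace the event came from
    name: str          # illustrative event name


def merge_traces(*traces: Iterable[TraceEvent]) -> Iterator[TraceEvent]:
    """Lazily merge traces that are each already sorted by timestamp.

    heapq.merge keeps only one pending event per input in memory,
    which matters when each trace holds millions of events.
    """
    return heapq.merge(*traces, key=lambda event: event.timestamp_ns)


# Made-up sample traces, each already time-ordered.
kernel_trace = [
    TraceEvent(100, "kernel", "syscall_entry"),
    TraceEvent(400, "kernel", "syscall_exit"),
    TraceEvent(900, "kernel", "irq_entry"),
]
app_trace = [
    TraceEvent(250, "app", "request_start"),
    TraceEvent(700, "app", "request_end"),
]

merged = list(merge_traces(kernel_trace, app_trace))
for event in merged:
    print(event.timestamp_ns, event.trace, event.name)
```

The merge is lazy, so a viewer built this way could display the first events of a combined stream without reading every trace to the end.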