Chapter 29. OProfile
OProfile is a low-overhead, system-wide performance monitoring tool. It uses the performance monitoring hardware on the processor to retrieve information about the kernel and executables on the system, such as when memory is referenced, the number of L2 cache requests, and the number of hardware interrupts received. On a Red Hat Enterprise Linux system, the oprofile package must be installed to use this tool.
Many processors include dedicated performance monitoring hardware. This hardware makes it possible to detect when certain events happen (such as the requested data not being in cache). The hardware normally takes the form of one or more counters that are incremented each time an event takes place. When the counter value rolls over, an interrupt is generated, making it possible to control the amount of detail (and therefore overhead) produced by performance monitoring.
OProfile uses this hardware (or a timer-based substitute in cases where performance monitoring hardware is not present) to collect samples of performance-related data each time a counter generates an interrupt. These samples are periodically written out to disk; later, the data contained in these samples can be used to generate reports on system-level and application-level performance.
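The relationship between the counter rollover point and the amount of detail collected can be illustrated with a small simulation. This is a minimal sketch, not OProfile's implementation: the function name `sample_events`, the `count` parameter, and the event addresses are all hypothetical, and the counter is modeled as a plain integer that resets after `count` events.

```python
def sample_events(event_addresses, count):
    """Record one sample each time the simulated counter rolls over.

    event_addresses: address at which each (hypothetical) event occurred.
    count: number of events between rollovers (assumed to be configurable).
    """
    samples = []
    counter = 0
    for address in event_addresses:
        counter += 1
        if counter == count:
            # The rollover stands in for the interrupt; the handler
            # records where execution was when the interrupt fired.
            samples.append(address)
            counter = 0
    return samples


# 1,000 made-up events cycling through four made-up instruction addresses.
events = [0x1000 + (i % 4) * 4 for i in range(1000)]

coarse = sample_events(events, count=100)  # fewer interrupts, less detail
fine = sample_events(events, count=10)     # more interrupts, more overhead

print(len(coarse), len(fine))
```

A smaller rollover interval yields more samples from the same stream of events, which means finer detail at the cost of more interrupts to service.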
OProfile is a useful tool, but be aware of some limitations when using it:
- Use of shared libraries — Samples for code in shared libraries are not attributed to the particular application unless the --separate=library option is used.
- Performance monitoring samples are inexact — When a performance monitoring register triggers a sample, the interrupt handling is not precise, unlike a divide-by-zero exception. Due to the out-of-order execution of instructions by the processor, the sample may be recorded on a nearby instruction.
- opreport does not associate samples for inline functions properly — opreport uses a simple address range mechanism to determine which function an address is in. Inline function samples are not attributed to the inline function but rather to the function the inline function was inserted into.
- OProfile accumulates data from multiple runs — OProfile is a system-wide profiler and expects processes to start up and shut down multiple times. Thus, samples from multiple runs accumulate. Use the command opcontrol --reset to clear out the samples from previous runs.
- Hardware performance counters do not work on guest virtual machines — Because the hardware performance counters are not available on virtual systems, you need to use the timer mode. Run the command opcontrol --deinit, and then execute modprobe oprofile timer=1 to enable the timer mode.
- Non-CPU-limited performance problems — OProfile is oriented to finding problems with CPU-limited processes. OProfile does not identify processes that are asleep because they are waiting on locks or for some other event to occur (for example, an I/O device finishing an operation).
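The inline function limitation above follows directly from address-range lookup, which the following sketch illustrates. It is an assumption-laden model rather than opreport's actual code: the symbol table, the function names, and the addresses are invented, and each function's range is taken to extend up to the start of the next symbol.

```python
import bisect
from collections import Counter

# Hypothetical symbol table, sorted by start address.
# An inlined helper has no entry of its own: its instructions
# live inside the address range of the function that absorbed it.
SYMBOLS = [(0x1000, "main"), (0x1400, "parse_input"), (0x1800, "compute")]
STARTS = [start for start, _ in SYMBOLS]


def function_for(address):
    """Return the name of the function whose range contains address."""
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return "(unknown)"  # address lies below the first known symbol
    return SYMBOLS[index][1]


# Suppose a helper named square() was inlined into compute() and its
# instructions occupy 0x1820-0x183f. Three samples land in that region.
samples = [0x1824, 0x1828, 0x1830, 0x1404, 0x1010]
report = Counter(function_for(address) for address in samples)

for name, hits in sorted(report.items()):
    print(name, hits)
```

All three samples taken inside the inlined code are counted under `compute`, and `square` never appears in the report, because nothing in the symbol table distinguishes it from its caller.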
29.1. Overview of Tools
Table 29.1, “OProfile Commands” provides a brief overview of the tools provided with the oprofile package.
Table 29.1. OProfile Commands
| Command | Description |
|---|---|
| ophelp | Displays available events for the system's processor along with a brief description of each. |
| opimport | Converts sample database files from a foreign binary format to the native format for the system. Only use this option when analyzing a sample database from a different architecture. |
| opannotate | Creates annotated source for an executable if the application was compiled with debugging symbols. See Section 29.5.4 for details. |
| opcontrol | Configures what data is collected. See Section 29.2, “Configuring OProfile” for details. |
| opreport | Retrieves profile data. See Section 29.5.1 for details. |
| oprofiled | Runs as a daemon to periodically write sample data to disk. |