Chapter 12. Monitoring system performance with perf

As a system administrator you can use the perf tool to collect and analyze performance data of your system.

12.1. Recording a performance profile in per-CPU mode

You can use perf record in per-CPU mode to sample and record performance data in both user space and kernel space simultaneously, across all threads on a monitored CPU. By default, per-CPU mode monitors all online CPUs.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.

Procedure

  • Sample and record the performance data:

    # perf record -a command

    Replace command with the command that you want to run while perf record samples data. If you do not specify a command, perf record samples data until you stop it manually by pressing Ctrl+C.
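As a minimal sketch of the step above, the following helper bounds a system-wide recording to a fixed time by using sleep as the sampled command, so you do not need to press Ctrl+C. The function name record_all_cpus and the 10-second default are illustrative assumptions. The helper only prints the command line, so you can review it before running it as root.

```shell
#!/bin/sh
# Hypothetical helper: build a system-wide recording command that ends by itself.
# perf record samples for as long as the given command runs, so "sleep N"
# limits the recording to roughly N seconds.
record_all_cpus() {
    duration="${1:-10}"   # seconds; 10 is an arbitrary illustrative default
    echo "perf record -a sleep ${duration}"
}

# Print the command for a 5-second recording.
record_all_cpus 5
```

To run the recording instead of printing it, replace echo with eval or paste the printed line into a root shell.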

Additional resources

  • The perf-record(1) man page.

12.2. Capturing call graph data with perf record

You can configure the perf record tool so that it records which functions call other functions in the performance profile. This helps to identify a bottleneck if several processes are calling the same function.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.

Procedure

  • Sample and record performance data with the --call-graph option:

    $ perf record --call-graph method command
    • Replace command with the command that you want to run while perf record samples data. If you do not specify a command, perf record samples data until you stop it manually by pressing Ctrl+C.
    • Replace method with one of the following unwinding methods:

      fp
      Uses the frame pointer method. Depending on compiler optimization, such as with binaries built with the GCC option -fomit-frame-pointer, this method might not be able to unwind the stack.
      dwarf
      Uses DWARF Call Frame Information to unwind the stack.
      lbr
      Uses the last branch record hardware on Intel processors.
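The choice of unwinding method can be sketched as a small wrapper that rejects anything other than the three methods listed above. The function name callgraph_cmd is a hypothetical helper, not part of perf. It prints the resulting command line rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: accept only the unwinding methods fp, dwarf, and lbr,
# then print the matching perf record command line.
# Usage: callgraph_cmd METHOD COMMAND [ARGS...]
callgraph_cmd() {
    method="$1"
    shift
    case "$method" in
        fp|dwarf|lbr) ;;
        *)
            echo "unknown unwinding method: $method" >&2
            return 1
            ;;
    esac
    echo "perf record --call-graph $method $*"
}

# Print a command that records call graphs with DWARF unwinding for 5 seconds.
callgraph_cmd dwarf sleep 5
```

A reasonable starting point is fp, with dwarf as a fallback when the binaries of interest were built without frame pointers.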

Additional resources

  • The perf-record(1) man page.

12.3. Identifying busy CPUs with perf

When investigating performance issues on a system, you can use the perf tool to identify the busiest CPUs in order to focus your efforts.

12.3.1. Displaying which CPU events were counted on with perf stat

You can use perf stat to display the CPU on which each event was counted by disabling CPU count aggregation. To use this functionality, you must count events in system-wide mode by using the -a flag.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.

Procedure

  • Count the events with CPU count aggregation disabled:

    # perf stat -a -A sleep seconds

    Replace seconds with the number of seconds to count events for. The previous example displays counts of a default set of common hardware and software events, recorded for as long as the sleep command runs, for each individual CPU in ascending order, starting with CPU0. Because the default set is broad, it can be useful to count a single event, such as cycles:

    # perf stat -a -A -e cycles sleep seconds
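To rank CPUs by the counted event, one option is to ask perf stat for comma-separated output with the -x option and sort it. The sketch below assumes that each line then starts with the CPU label followed by the counter value (for example CPU0,1200,,cycles,...). The function name busiest_cpus and the sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read lines produced by "perf stat -a -A -x, ..." on stdin
# and print "CPU count" pairs, highest count first.
# Assumed field layout: CPU label, counter value, unit, event name, ...
# Lines whose value is not a plain number (such as "<not counted>") are skipped.
busiest_cpus() {
    awk -F, '$2 ~ /^[0-9]+$/ { print $2, $1 }' | sort -rn | awk '{ print $2, $1 }'
}

# Made-up sample data standing in for real perf stat output.
printf '%s\n' \
    'CPU0,1200,,cycles,1000000,100.00,,' \
    'CPU1,98000,,cycles,1000000,100.00,,' \
    'CPU2,<not counted>,,cycles,0,100.00,,' \
    'CPU3,4500,,cycles,1000000,100.00,,' | busiest_cpus
```

With real data, perf stat writes its results to standard error, so a pipeline along the lines of perf stat -a -A -x, -e cycles sleep 5 2>&1 | busiest_cpus would be needed.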

12.3.2. Displaying which CPU samples were taken on with perf report

The perf record command samples performance data and stores it in a perf.data file, which you can read with the perf report command. The perf record command always records the CPU on which each sample was taken. You can configure perf report to display this information.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.
  • There is a perf.data file created with perf record in the current directory. If the perf.data file was created with root access, you need to run perf report with root access too.

Procedure

  • Display the contents of the perf.data file for further analysis while sorting by CPU:

    # perf report --sort cpu
    • You can sort by CPU and command to display more detailed information about where CPU time is being spent:

      # perf report --sort cpu,comm

      This example will list commands from all monitored CPUs by total overhead in descending order of overhead usage and identify the CPU the command was executed on.

12.3.3. Displaying specific CPUs during profiling with perf top

You can configure perf top to display specific CPUs and their relative usage while profiling your system in real time.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.

Procedure

  • Start the perf top interface while sorting by CPU:

    # perf top --sort cpu

    This example will list CPUs and their respective overhead in descending order of overhead usage in real time.

    • You can sort by CPU and command for more detailed information about where CPU time is being spent:

      # perf top --sort cpu,comm

      This example will list commands by total overhead in descending order of overhead usage and identify the CPU the command was executed on in real time.

12.4. Monitoring specific CPUs with perf

You can configure the perf tool to monitor only specific CPUs of interest.

12.4.1. Monitoring specific CPUs with perf record and perf report

You can configure perf record to only sample specific CPUs of interest and analyze the generated perf.data file with perf report for further analysis.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.

Procedure

  1. Sample and record the performance data on the specific CPUs, generating a perf.data file:

    • Using a comma-separated list of CPUs:

      # perf record -C 0,1 sleep seconds

      The previous example samples and records data on CPUs 0 and 1 for as long as the sleep command runs. Replace seconds with the number of seconds to record for.

    • Using a range of CPUs:

      # perf record -C 0-2 sleep seconds

      The previous example samples and records data on all CPUs from CPU 0 to CPU 2 for as long as the sleep command runs. Replace seconds with the number of seconds to record for.

  2. Display the contents of the perf.data file for further analysis:

    # perf report

    This example will display the contents of perf.data. If you are monitoring several CPUs and want to know which CPU data was sampled on, see Displaying which CPU samples were taken on with perf report.
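The two CPU list forms used in step 1 can also be combined, for example 0,2-4. As an illustration of what such a list covers, the following sketch expands it into individual CPU numbers. The function name expand_cpu_list is hypothetical. perf accepts the compact form directly, so this is only a way to check a list before passing it to the -C option.

```shell
#!/bin/sh
# Hypothetical helper: expand a CPU list such as "0,2-4" into "0 2 3 4".
# Assumes well-formed input: non-negative integers, commas, and ascending ranges.
expand_cpu_list() {
    out=""
    old_ifs="$IFS"
    IFS=','
    for part in $1; do
        case "$part" in
            *-*) first="${part%-*}"; last="${part#*-}" ;;   # range "a-b"
            *)   first="$part"; last="$part" ;;              # single CPU
        esac
        i="$first"
        while [ "$i" -le "$last" ]; do
            out="${out}${out:+ }${i}"
            i=$((i + 1))
        done
    done
    IFS="$old_ifs"
    echo "$out"
}

expand_cpu_list "0,2-4"
```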

12.4.2. Displaying specific CPUs during profiling with perf top

You can configure perf top to display specific CPUs and their relative usage while profiling your system in real time.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.

Procedure

  • Start the perf top interface while sorting by CPU:

    # perf top --sort cpu

    This example will list CPUs and their respective overhead in descending order of overhead usage in real time.

    • You can sort by CPU and command for more detailed information about where CPU time is being spent:

      # perf top --sort cpu,comm

      This example will list commands by total overhead in descending order of overhead usage and identify the CPU the command was executed on in real time.

12.5. Generating a perf.data file that is readable on a different device

You can use the perf tool to record performance data into a perf.data file to be analyzed on a different device.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.

Procedure

  1. Capture performance data you are interested in investigating further:

    # perf record -a --call-graph fp sleep seconds

    This example generates a perf.data file that covers the entire system for as long as the sleep command runs. Replace seconds with the number of seconds to record for. The example also captures call graph data by using the frame pointer method.

  2. Generate an archive file containing debug symbols of the recorded data:

    # perf archive

Verification steps

  • Verify that the archive file has been generated in your current active directory:

    # ls perf.data*

    The output will display every file in your current directory that begins with perf.data. The archive file will be named either:

    perf.data.tar.gz

    or

    perf.data.tar.bz2
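Because the archive can have either name, a script that transfers it to another device needs to check for both. The following is a minimal sketch of such a check. The function name find_perf_archive is a hypothetical helper and not part of perf.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the perf archive found in a directory.
# $1 = directory to search (defaults to the current directory)
find_perf_archive() {
    dir="${1:-.}"
    for name in perf.data.tar.bz2 perf.data.tar.gz; do
        if [ -f "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    echo "no perf archive found in $dir" >&2
    return 1
}

# Look in the current directory; a missing archive is reported, not fatal.
find_perf_archive . || true
```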

12.6. Analyzing a perf.data file that was created on a different device

You can use the perf tool to analyze a perf.data file that was generated on a different device.

Prerequisites

  • The perf user space tool is installed. For more information, see Installing perf.
  • A perf.data file and associated archive file generated on a different device are present on the current device being used.

Procedure

  1. Copy both the perf.data file and the archive file into your current active directory.
  2. Extract the archive file into ~/.debug:

    # mkdir -p ~/.debug
    # tar xf perf.data.tar.bz2 -C ~/.debug
    Note

    The archive file might also be named perf.data.tar.gz.

  3. Open the perf.data file for further analysis:

    # perf report
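Step 2 of the procedure can be wrapped in a small function that works for either archive name. This is a sketch under two assumptions: the function name extract_perf_archive is hypothetical, and the tar implementation detects the compression format automatically when extracting, as GNU tar does.

```shell
#!/bin/sh
# Hypothetical helper: unpack a perf archive into a debug directory.
# $1 = path to the archive (perf.data.tar.bz2 or perf.data.tar.gz)
# $2 = target directory (defaults to ~/.debug)
extract_perf_archive() {
    archive="$1"
    target="${2:-$HOME/.debug}"
    if [ ! -f "$archive" ]; then
        echo "missing archive: $archive" >&2
        return 1
    fi
    # Create the target if needed, then extract; tar picks the decompressor.
    mkdir -p "$target" && tar xf "$archive" -C "$target"
}

# Example call (commented out so that loading this file changes nothing):
# extract_perf_archive perf.data.tar.gz
```

After the extraction succeeds, run perf report in the directory that contains the copied perf.data file.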