6.2. Monitoring and Diagnosing Performance Problems
Red Hat Enterprise Linux 7 provides a number of tools that are useful for monitoring system performance and diagnosing performance problems related to processors and their configuration. This section outlines the available tools and gives examples of how to use them to monitor and diagnose processor-related performance issues.
6.2.1. turbostat
Turbostat prints counter results at specified intervals to help administrators identify unexpected behavior in servers, such as excessive power usage, failure to enter deep sleep states, or system management interrupts (SMIs) being created unnecessarily.
The turbostat tool is part of the kernel-tools package and is supported on systems with AMD64 and Intel® 64 processors. It requires root privileges to run, and the processor must support invariant time stamp counters and the APERF and MPERF model-specific registers.
For usage examples, see the man page:
$ man turbostat
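Before running turbostat, it can help to confirm that the processor exposes the features listed above. The following is a minimal sketch, not part of the kernel-tools package. It assumes that the kernel reports invariant time stamp counter support as the constant_tsc and nonstop_tsc flags in /proc/cpuinfo, and APERF and MPERF support as the aperfmperf flag. The function names has_cpu_flag and check_turbostat_flags are hypothetical.

```shell
#!/bin/sh
# Sketch: check for the CPU features that turbostat relies on.
# Assumption: /proc/cpuinfo reports invariant TSC support as the
# "constant_tsc" and "nonstop_tsc" flags and APERF/MPERF as "aperfmperf".

# has_cpu_flag FLAG [FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line.
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" | grep -qw -- "$1"
}

# check_turbostat_flags [FILE]
# Prints one "flag: present|missing" line per flag; fails if any is missing.
check_turbostat_flags() {
    rc=0
    for f in constant_tsc nonstop_tsc aperfmperf; do
        if has_cpu_flag "$f" "${1:-/proc/cpuinfo}"; then
            echo "$f: present"
        else
            echo "$f: missing"
            rc=1
        fi
    done
    return $rc
}

# Run against the live system only where /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    check_turbostat_flags || echo "turbostat may not work on this processor"
fi
```

If any flag is reported as missing, turbostat is unlikely to produce useful output on that system.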
6.2.2. numastat
The numastat tool received substantial updates in the Red Hat Enterprise Linux 6 life cycle. While the default output remains compatible with the original tool written by Andi Kleen, supplying any options or parameters to numastat significantly changes the format of its output.
The numastat tool displays per-NUMA node memory statistics for processes and the operating system and shows administrators whether process memory is spread throughout a system or centralized on specific nodes.
Cross-reference numastat output with per-processor top output to confirm that process threads are running on the same node from which process memory is allocated.
Numastat is provided by the numactl package. For further information about numastat output, see the man page:
$ man numastat
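Cross-referencing numastat output with per-processor information requires a way to map a CPU number to its NUMA node. The sketch below assumes the usual sysfs layout, in which each /sys/devices/system/node/nodeN directory contains one cpuM entry per CPU on that node. The helper names cpu_to_node and thread_nodes are hypothetical, and the ps psr column (the processor a thread last ran on) stands in here for the per-processor view of top.

```shell
#!/bin/sh
# cpu_to_node CPU [NODE_DIR]
# Prints the NUMA node whose sysfs directory contains an entry for CPU.
# NODE_DIR defaults to /sys/devices/system/node (assumed layout).
cpu_to_node() {
    for d in "${2:-/sys/devices/system/node}"/node[0-9]*; do
        if [ -e "$d/cpu$1" ]; then
            echo "${d##*/node}"
            return 0
        fi
    done
    return 1
}

# thread_nodes PID
# Lists each thread of PID with the CPU it last ran on and that CPU's node.
thread_nodes() {
    ps -L -o tid= -o psr= -p "$1" | while read -r tid cpu; do
        echo "tid=$tid cpu=$cpu node=$(cpu_to_node "$cpu" || echo unknown)"
    done
}
```

For a hypothetical process ID 1234, comparing the output of thread_nodes 1234 with that of numastat -p 1234 shows whether the threads run on the node that holds most of the process memory.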
6.2.3. /proc/interrupts
The /proc/interrupts file lists the number of interrupts sent to each processor from a particular I/O device. It displays the interrupt request (IRQ) number, the number of that type of interrupt request handled by each processor in the system, the type of interrupt sent, and a comma-separated list of devices that respond to the listed interrupt request.
If a particular application or device is generating a large number of interrupt requests to be handled by a remote processor, its performance is likely to suffer. In this case, poor performance can be alleviated by having a processor on the same node as the application or device handle the interrupt requests. For details on how to assign interrupt handling to a specific processor, see Section 6.3.7, “Setting Interrupt Affinity on AMD64 and Intel 64”.
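To see which processors are actually handling a device's interrupts, the per-CPU columns of /proc/interrupts can be totaled for the lines that name that device. The following awk-based sketch assumes the usual layout: a header row of CPU names, then one row per IRQ with one count per CPU. The helper name irq_per_cpu is hypothetical, and it matches the device name as a plain substring of the whole line, so a very short name can match unrelated rows.

```shell
#!/bin/sh
# irq_per_cpu DEVICE [FILE]
# Prints "CPUn total" for every CPU, summing the interrupt counts of all
# rows in FILE (default /proc/interrupts) that contain the string DEVICE.
irq_per_cpu() {
    awk -v dev="$1" '
        # Header row: remember the CPU column names.
        NR == 1 { ncpu = NF; for (i = 1; i <= NF; i++) name[i] = $i; next }
        # Data rows: field 1 is the IRQ label, fields 2..ncpu+1 are counts.
        index($0, dev) {
            for (i = 1; i <= ncpu; i++)
                if ($(i + 1) ~ /^[0-9]+$/) total[i] += $(i + 1)
        }
        END { for (i = 1; i <= ncpu; i++) printf "%s %d\n", name[i], total[i] }
    ' "${2:-/proc/interrupts}"
}
```

For example, irq_per_cpu eth0 (eth0 being a hypothetical device name) prints one total per CPU. If most of the interrupts land on processors outside the node that the device is attached to, setting interrupt affinity is worth considering.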
6.2.4. Cache and Memory Bandwidth Monitoring with pqos
The pqos utility, which is available from the intel-cmt-cat package, enables you to monitor CPU cache and memory bandwidth on recent Intel processors.
The pqos utility provides a cache and memory monitoring tool similar to the top utility. It monitors:
- The instructions per cycle (IPC).
- The count of last-level cache (LLC) misses (MISSES).
- The amount of LLC, in kilobytes, occupied by the program executing on a given CPU.
- The bandwidth to local memory (MBL).
- The bandwidth to remote memory (MBR).
Use the following command to start the monitoring tool:
# pqos --mon-top
Items in the output are sorted by the highest LLC occupancy.
- For a general overview of the pqos utility and the related processor features, see Section 2.14, “pqos”.
- For an example of how using Cache Allocation Technology (CAT) can minimize the impact of a noisy neighbor virtual machine on the network performance of Data Plane Development Kit (DPDK), see the Intel white paper “Increasing Platform Determinism with Platform Quality of Service for the Data Plane Development Kit”.