6.3. Configuration Suggestions
6.3.1. Configuring Kernel Tick Time
Dynamic tickless behavior can be enabled on specified cores with the nohz_full kernel parameter. On a 16 core system, specifying nohz_full=1-15 enables dynamic tickless behavior on cores 1 through 15, moving all timekeeping to the only unspecified core (core 0). This behavior can be enabled either temporarily at boot time, or persistently in the /etc/default/grub file. For persistent behavior, run the grub2-mkconfig -o /boot/grub2/grub.cfg command to save your configuration.
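For illustration, a persistent configuration could look like the following fragment of the /etc/default/grub file; the nohz_full value assumes the 16 core example above, and the rest of the kernel command line on your system will differ:

```shell
# /etc/default/grub (fragment): append nohz_full to the kernel command line.
# Cores 1-15 become dynamically tickless; core 0 keeps all timekeeping duties.
GRUB_CMDLINE_LINUX="... nohz_full=1-15"
```

After editing the file, run grub2-mkconfig -o /boot/grub2/grub.cfg as described above so the setting survives reboots.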
- When the system boots, you must manually move rcu threads to the non-latency-sensitive core, in this case core 0.
# for i in `pgrep rcu[^c]` ; do taskset -pc 0 $i ; done
- Use the isolcpus parameter on the kernel command line to isolate certain cores from user-space tasks.
- Optionally, set CPU affinity for the kernel's write-back bdi-flush threads to the housekeeping core:
echo 1 > /sys/bus/workqueue/devices/writeback/cpumask
Once the system has rebooted, verify that the dynamic tickless configuration is working:
# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
Here, stress is a program that spins on the CPU for 1 second. One possible replacement for stress is a script that runs something like while :; do d=1; done.
The default kernel timer configuration shows around 1000 ticks on a busy CPU:
# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
1000 irq_vectors:local_timer_entry
With the dynamic tickless kernel configured, you should see 1 tick instead:
# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1
1 irq_vectors:local_timer_entry
6.3.2. Setting Hardware Performance Policy (x86_energy_perf_policy)
The x86_energy_perf_policy tool allows administrators to define the relative importance of performance and energy efficiency. By default, it operates on all processors in performance mode. It requires processor support, which is indicated by the presence of CPUID.06H.ECX.bit3, and must be run with root privileges.
$ man x86_energy_perf_policy
6.3.3. Setting Process Affinity with taskset
The taskset tool is provided by the util-linux package. It allows administrators to retrieve and set the processor affinity of a running process, or launch a process with a specified processor affinity. For more information, see the man page:
$ man taskset
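As a quick sketch of typical usage (echo stands in for a real workload):

```shell
# Launch a command pinned to CPU 0 (echo is a stand-in workload):
taskset -c 0 echo "pinned"

# Query the affinity mask of the current shell:
taskset -p $$
```

The -c option takes a CPU list; the -p option reads or changes the affinity of an already-running process by PID.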
6.3.4. Managing NUMA Affinity with numactl
numactl allows administrators to run a process with a specified scheduling or memory placement policy. It can also set a persistent policy for shared memory segments or files, and set the processor affinity and memory affinity of a process. For more information, see the man page:
$ man numactl
numactl also provides the libnuma library. This library offers a simple programming interface to the NUMA policy supported by the kernel, and can be used for more fine-grained tuning than the numactl application. For more information, see the man page:
$ man numa
6.3.5. Automatic NUMA Affinity Management with numad
numad is an automatic NUMA affinity management daemon. It monitors NUMA topology and resource usage within a system in order to dynamically improve NUMA resource allocation and management.
$ man numad
6.3.6. Tuning Scheduling Policy
6.3.6.1. Scheduling Policies
6.3.6.1.1. Static Priority Scheduling with SCHED_FIFO
SCHED_FIFO (also called static priority scheduling) is a realtime policy that defines a fixed priority for each thread. This policy allows administrators to improve event response time and reduce latency, and is recommended for time sensitive tasks that do not run for an extended period of time.
When SCHED_FIFO is in use, the scheduler scans the list of all SCHED_FIFO threads in priority order and schedules the highest priority thread that is ready to run. The priority level of a SCHED_FIFO thread can be any integer from 1 to 99, with 99 treated as the highest priority. Red Hat recommends starting at a low number and increasing priority only when you identify latency issues.
Administrators can limit SCHED_FIFO bandwidth to prevent realtime application programmers from initiating realtime tasks that monopolize the processor.
- /proc/sys/kernel/sched_rt_period_us: This parameter defines the time period in microseconds that is considered to be one hundred percent of processor bandwidth. The default value is 1000000 μs, or 1 second.
- /proc/sys/kernel/sched_rt_runtime_us: This parameter defines the time period in microseconds that is devoted to running realtime threads. The default value is 950000 μs, or 0.95 seconds.
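The current limits can be inspected through the /proc interface; with stock defaults the following prints 1000000 and 950000, leaving 50000 μs per period for non-realtime work:

```shell
# Read the realtime bandwidth limits (values in microseconds):
cat /proc/sys/kernel/sched_rt_period_us
cat /proc/sys/kernel/sched_rt_runtime_us
```

Writing new values into these files as root takes effect immediately.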
6.3.6.1.2. Round Robin Priority Scheduling with SCHED_RR
SCHED_RR is a round-robin variant of SCHED_FIFO. This policy is useful when multiple threads need to run at the same priority level.
Like SCHED_FIFO, SCHED_RR is a realtime policy that defines a fixed priority for each thread. The scheduler scans the list of all SCHED_RR threads in priority order and schedules the highest priority thread that is ready to run. However, unlike SCHED_FIFO, threads that have the same priority are scheduled round-robin style within a certain time slice.
You can set the value of this time slice with the sched_rr_timeslice_ms kernel parameter (/proc/sys/kernel/sched_rr_timeslice_ms). The lowest value is 1 millisecond.
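The chrt utility from util-linux can show the priority ranges each policy accepts and, with root privileges, start a task under SCHED_RR; a brief sketch (sleep is a stand-in workload):

```shell
# Show the min/max priorities valid for each scheduling policy:
chrt -m

# Inspect the current round-robin time slice in milliseconds:
cat /proc/sys/kernel/sched_rr_timeslice_ms

# As root, run a command under SCHED_RR at priority 10:
# chrt -r 10 sleep 1
```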
6.3.6.1.3. Normal Scheduling with SCHED_OTHER
SCHED_OTHER is the default scheduling policy in Red Hat Enterprise Linux 7. This policy uses the Completely Fair Scheduler (CFS) to allow fair processor access to all threads scheduled with this policy. This policy is most useful when there are a large number of threads or data throughput is a priority, as it allows more efficient scheduling of threads over time.
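You can confirm which policy a process runs under with chrt; querying the current shell, for example, reports SCHED_OTHER for a process scheduled by CFS:

```shell
# Report the scheduling policy and realtime priority of the current shell.
# Processes under the default policy show SCHED_OTHER with priority 0;
# their relative share of the CPU is adjusted with nice values instead.
chrt -p $$
```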
6.3.6.2. Isolating CPUs
You can isolate one or more CPUs from the scheduler with the isolcpus boot parameter. This prevents the scheduler from scheduling any user-space threads on these CPUs.
Once a CPU is isolated, you must manually assign processes to the isolated CPU, either with the CPU affinity system calls or the numactl command.
Alternatively, CPUs can be isolated with the Tuna tool. However, this method of isolation is subtly different from the isolcpus parameter, and does not currently achieve the performance gains associated with isolcpus. See Section 6.3.8, “Configuring CPU, Thread, and Interrupt Affinity with Tuna” for more details about this tool.
6.3.7. Setting Interrupt Affinity on AMD64 and Intel 64
Interrupt requests have an associated affinity property, smp_affinity, which defines the processors that will handle the interrupt request. To improve application performance, assign interrupt affinity and process affinity to the same processor, or processors on the same core. This allows the specified interrupt and application threads to share cache lines.
Procedure 6.1. Balancing Interrupts Automatically
- If your BIOS exports its NUMA topology, the irqbalance service can automatically serve interrupt requests on the node that is local to the hardware requesting service.
For details on configuring irqbalance, see Section A.1, “irqbalance”.
Procedure 6.2. Balancing Interrupts Manually
- Check which devices correspond to the interrupt requests that you want to configure.
Starting with Red Hat Enterprise Linux 7.5, the system configures the optimal interrupt affinity for certain devices and their drivers automatically. You can no longer configure their affinity manually. This applies to the following devices:
- Devices using the
- NVMe PCI devices
- Find the hardware specification for your platform. Check if the chipset on your system supports distributing interrupts.
- If it does, you can configure interrupt delivery as described in the following steps. Additionally, check which algorithm your chipset uses to balance interrupts. Some BIOSes have options to configure interrupt delivery.
- If it does not, your chipset will always route all interrupts to a single, static CPU. You cannot configure which CPU is used.
- Check which Advanced Programmable Interrupt Controller (APIC) mode is in use on your system.
Only non-physical flat mode (flat) supports distributing interrupts to multiple CPUs. This mode is available only for systems that have up to 8 CPUs.
$ journalctl --dmesg | grep APIC
In the command output:
- If your system uses a mode other than flat, you can see a line similar to Setting APIC routing to physical flat.
- If you can see no such message, your system uses flat mode.
If your system uses x2apic mode, you can disable it by adding the nox2apic option to the kernel command line in the bootloader configuration.
- Calculate the smp_affinity mask.
The smp_affinity value is stored as a hexadecimal bit mask representing all processors in the system. Each bit configures a different CPU. The least significant bit is CPU 0.
The default value of the mask is f, meaning that an interrupt request can be handled on any processor in the system. Setting this value to 1 means that only processor 0 can handle the interrupt.
Procedure 6.3. Calculating the Mask
- In binary, use the value 1 for CPUs that will handle the interrupts.
For example, to handle interrupts with CPU 0 and CPU 7, use 0000000010000001 as the binary code:

Table 6.1. Binary Bits for CPUs

CPU     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Binary   0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  1

- Convert the binary code to hexadecimal. For example, to convert the binary code using Python:

>>> hex(int('0000000010000001', 2))
'0x81'

- On systems with more than 32 processors, you must delimit smp_affinity values for discrete 32 bit groups. For example, if you want only the first 32 processors of a 64 processor system to service an interrupt request, specify the mask as two comma-delimited 32 bit groups.
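The mask calculation can also be scripted; a small sketch that builds the hexadecimal smp_affinity value for an arbitrary list of CPUs (CPUs 0 and 7 reproduce the example from Table 6.1):

```shell
# Build an smp_affinity bit mask from a list of CPU numbers.
mask=0
for cpu in 0 7; do
    mask=$(( mask | (1 << cpu) ))   # set the bit for this CPU
done
printf '%x\n' "$mask"               # prints 81
```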
- Set the smp_affinity mask.
The interrupt affinity value for a particular interrupt request is stored in the associated /proc/irq/irq_number/smp_affinity file. Write the calculated mask to the associated file:
# echo mask > /proc/irq/irq_number/smp_affinity
- On systems that support interrupt steering, modifying the smp_affinity property of an interrupt request sets up the hardware so that the decision to service an interrupt with a particular processor is made at the hardware level with no intervention from the kernel.
For more information about interrupt steering, see Chapter 9, Networking.