Controlling the Performance Impact of Microcode and Security Patches for CVE-2017-5754, CVE-2017-5715, and CVE-2017-5753 using Red Hat Enterprise Linux Tunables


Overview

Red Hat Customer Portal Labs provides a Spectre And Meltdown Detector to help you detect if your systems are vulnerable to these CVEs.

The recent speculative execution CVEs describe three potential attacks that affect a wide variety of processor architectures and platforms, each requiring a slightly different fix. In many cases, these fixes also require microcode updates from the hardware vendors.

Red Hat has made updated kernels available to address these security vulnerabilities. These patches are enabled by default (detailed below) because Red Hat prioritizes out-of-the-box security. Because speculative execution is a performance optimization technique, these updates (both kernel and microcode) may result in workload-specific performance degradation. Some customers who feel confident that their systems are well protected by other means (such as physical isolation) may therefore wish to disable some or all of these kernel patches. For end users who elect to keep the patches enabled in the interest of security, this article also provides a mechanism to conduct performance characterizations with and without the fixes enabled.

The security vulnerabilities described in these three CVEs may be found in modern microprocessors and operating systems on major hardware platforms including x86 (Intel and AMD chipsets), System Z, Power, and ARM.

Retpoline Kernels

As of March 2018, on x86 CPUs, Red Hat uses “Retpoline” code sequences for indirect branches in the kernel to isolate those branches from speculative execution. For Intel processors prior to Skylake, Retpolines are used instead of the ibrs feature to mitigate Spectre variant 2. For Skylake, because of concerns about whether Retpolines fully mitigate the CVE, ibrs is still used and Retpolines are disabled.

A patched GCC compiler with Retpoline support is used to compile the Retpoline-patched kernel. The patched GCC compiler is also needed to compile kernel modules that will run on that kernel. SystemTap is one example of a tool that uses kernel modules to run code in kernel space, so it also needs the patched compiler.

Disabling the CVE mitigations:

For Red Hat Enterprise Linux kernels on x86, three debugfs tunables control the behavior of the various patches in the updated kernel. These patches require updated microcode, which can be obtained from the hardware platform providers.

These three debugfs tunables can be enabled or disabled on the kernel command line at boot, or at runtime via debugfs controls. The tunables control Page Table Isolation (pti), Indirect Branch Restricted Speculation (ibrs), and retpolines (retp). Depending on the CPU type, Red Hat enables each of these features by default as needed to protect the architecture detected at boot.

For those wanting to disable the security mitigation for these CVEs to recover the performance loss, two options are available:

Persistently disable - Effective across a reboot

The first option is to disable the mitigations by adding the flags below to the kernel command line and then rebooting so that they take effect:

     spectre_v2=off nopti

Note: Each parameter can be disabled individually; for performance characterization, it is not required that all of them be disabled simultaneously.
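As a hedged sketch of the persistent route: on RHEL the flags are commonly added with grubby (shown only as a comment below, because it changes the boot configuration and needs root), and after the reboot the running command line can be checked. The helper name cmdline_has_flag and its optional file argument are illustrative assumptions, not part of any Red Hat tool.

```shell
#!/bin/sh
# Adding the flags persistently (run as root, then reboot). Shown as a
# comment because it modifies the boot configuration:
#   grubby --update-kernel=ALL --args="spectre_v2=off nopti"

# cmdline_has_flag FLAG [FILE]
# Hypothetical helper: succeeds if FLAG appears as a whole,
# space-separated word in the kernel command line.
# FILE defaults to /proc/cmdline.
cmdline_has_flag() {
    case " $(cat "${2:-/proc/cmdline}") " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `cmdline_has_flag nopti && echo "PTI disabled at boot"` reports whether the running kernel was booted with nopti.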

Runtime disable - Does not persist through a reboot

The second option is to disable them at runtime with the following three commands. The change is immediately active and does not require a reboot.

    # echo 0 > /sys/kernel/debug/x86/pti_enabled
    # echo 0 > /sys/kernel/debug/x86/retp_enabled
    # echo 0 > /sys/kernel/debug/x86/ibrs_enabled

Note that this requires the debugfs filesystem to be mounted. In RHEL 7, debugfs is mounted by default. In RHEL 6, you can mount it manually with:

    # mount -t debugfs nodev /sys/kernel/debug
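The runtime steps above can be sketched as a small helper. This is a minimal sketch, not a Red Hat-provided tool: the function name disable_tunables and its optional directory argument are illustrative, and retp_enabled is listed on the assumption that the retp tunable is exposed under that file name. Files that are missing or not writable are skipped, not treated as errors.

```shell
#!/bin/sh
# disable_tunables [DIR]
# Hypothetical helper: writes 0 to each writable mitigation tunable found
# under DIR (default: /sys/kernel/debug/x86). On a real system this must
# run as root with debugfs mounted.
disable_tunables() {
    dt_dir=${1:-/sys/kernel/debug/x86}
    # retp_enabled is an assumed file name for the retp tunable.
    for dt_name in pti_enabled ibrs_enabled retp_enabled; do
        dt_file=$dt_dir/$dt_name
        if [ -w "$dt_file" ]; then
            echo 0 > "$dt_file" && echo "$dt_name: disabled"
        else
            echo "$dt_name: skipped (missing or not writable)"
        fi
    done
}
```

Calling `disable_tunables` with no argument targets the default debugfs location; the change does not persist through a reboot.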

Verifying changes

To verify that the fixes for these CVEs are disabled, cat the following three files and confirm that each value is 0.

    # cat /sys/kernel/debug/x86/pti_enabled
    # cat /sys/kernel/debug/x86/retp_enabled
    # cat /sys/kernel/debug/x86/ibrs_enabled

Some applications may still see a small performance loss even with the above CVE flags disabled.
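The verification step can also be scripted so that a benchmark run refuses to start unless the mitigations really are off. The sketch below is illustrative only: the function name mitigations_disabled is hypothetical, and the list of file names (including retp_enabled) is an assumption about which tunables the running kernel exposes. Files that do not exist are reported and ignored.

```shell
#!/bin/sh
# mitigations_disabled [DIR]
# Hypothetical helper: prints each tunable found under DIR
# (default: /sys/kernel/debug/x86) and succeeds only if every
# tunable that is present reads 0.
mitigations_disabled() {
    md_dir=${1:-/sys/kernel/debug/x86}
    md_status=0
    for md_name in pti_enabled ibrs_enabled retp_enabled ibpb_enabled; do
        md_file=$md_dir/$md_name
        [ -r "$md_file" ] || { echo "$md_name: not present"; continue; }
        md_value=$(cat "$md_file")
        echo "$md_name=$md_value"
        [ "$md_value" = "0" ] || md_status=1
    done
    return "$md_status"
}
```

A benchmark wrapper could then use `mitigations_disabled || exit 1` before starting the unmitigated measurement.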

Details:

The rest of this article describes more specifics about each CVE variant.

  • CVE-2017-5753 (variant #1/Spectre) is a bounds-checking exploit during branching. This issue is fixed with a kernel patch. Variant #1 protection is always enabled; it is not possible to disable the patches. Red Hat’s performance testing for variant #1 did not show any measurable impact.

  • CVE-2017-5715 (variant #2/Spectre) is an indirect branch poisoning attack that can lead to data leakage. This attack allows a virtualized guest to read memory from the host system. This issue is corrected with microcode, along with kernel and virtualization updates to both guest and host virtualization software; both the updated microcode and the kernel patches are required. Variant #2 behavior is controlled by the ibrs tunable, which works in conjunction with the microcode, and by the retp tunable. The ibpb tunable is still visible, but it is now read-only and is set by the kernel.

  • CVE-2017-5754 (variant #3/Meltdown) is an exploit that uses speculative cache loading to allow a local attacker to read the contents of memory. This issue is corrected with kernel patches. Variant #3 behavior is controlled by the pti tunable (nopti/pti_enabled).

As noted, installing the microcode update for your hardware, if provided by the hardware vendor, is necessary to protect against variant 2. Please contact your hardware vendor for microcode updates.

Page Table Isolation (pti)

"nopti"/pti_enabled controls the Kernel Page Table Isolation feature, which isolates kernel pagetables when running in userland. This feature addresses CVE-2017-5754, also called variant #3, or Meltdown.

Customers and vendors can disable the PTI feature by passing "nopti" to the kernel command line at boot, or dynamically with the runtime debugfs control below:

    # echo 0 > /sys/kernel/debug/x86/pti_enabled

Indirect Branch Restricted Speculation (ibrs)

"noibrs"/ibrs_enabled controls the IBRS feature in the SPEC_CTRL model-specific register (MSR) when SPEC_CTRL is present in cpuid (post microcode update). When ibrs_enabled is set to 1 (spectre_v2=ibrs), the kernel runs with indirect branch restricted speculation, which protects the kernel space from attacks (even from hyperthreading/simultaneous multi-threading attacks). When ibrs_enabled is set to 2 (spectre_v2=ibrs_always), both userland and kernel run with indirect branch restricted speculation. This protects userspace from hyperthreading/simultaneous multi-threading attacks as well, and is also the default on certain old AMD processors (family 10h, 12h and 16h). This feature addresses CVE-2017-5715, variant #2.

When ibrs_enabled is set to 3, only userland runs with indirect branch restricted speculation. This can be used in combination with retpoline (spectre_v2=retpoline,ibrs_user) to provide similar security to ibrs_always with less performance overhead.

Customers and vendors can disable the ibrs implementation in microcode by passing "noibrs" to the kernel command line at boot, or dynamically with the debugfs control below:

    # echo 0 > /sys/kernel/debug/x86/ibrs_enabled
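The ibrs_enabled values described in this section can be summarized in a small decoder. This is only an illustrative sketch: the function name describe_ibrs is hypothetical, and the description of 0 as "disabled" is inferred from the echo 0 command above.

```shell
#!/bin/sh
# describe_ibrs VALUE
# Hypothetical helper: prints a one-line description of an ibrs_enabled
# value, following the modes described in this article. Returns non-zero
# for a value the article does not describe.
describe_ibrs() {
    case $1 in
        0) echo "ibrs disabled" ;;
        1) echo "kernel runs with ibrs (spectre_v2=ibrs)" ;;
        2) echo "kernel and userland run with ibrs (spectre_v2=ibrs_always)" ;;
        3) echo "userland only runs with ibrs (spectre_v2=retpoline,ibrs_user)" ;;
        *) echo "unknown ibrs_enabled value: $1"; return 1 ;;
    esac
}
```

On a patched system it could be called as `describe_ibrs "$(cat /sys/kernel/debug/x86/ibrs_enabled)"`.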

Indirect Branch Prediction Barriers (ibpb)

Note: The ibpb tuning knob is now read-only and will be set by the kernel if either ibrs or retp is set. As with ibrs, ibpb needs updated microcode in order to work (and be set) correctly.

ibpb controls the IBPB feature in the PRED_CMD model-specific register (MSR) if either IBPB_SUPPORT or SPEC_CTRL is present in cpuid (post microcode update). When ibpb_enabled is set to 1, an IBPB barrier that flushes the contents of the indirect branch predictor is run across user mode or guest mode context switches to prevent user and guest mode from attacking other applications or virtual machines on the same host. In order to protect virtual machines from other virtual machines, ibpb_enabled=1 is needed even if ibrs_enabled is set to 2. This feature addresses CVE-2017-5715, variant #2.

Architectural Defaults

By default, the appropriate tunables that apply to an architecture will be enabled automatically at boot time, based upon the architecture detected.

Intel Defaults:

pti=1 ibrs=0 retp=1 ibpb=1 -> fix variant #1 #2 #3 for pre-Skylake CPUs
pti=1 ibrs=1 retp=0 ibpb=1 -> fix variant #1 #2 #3 for Skylake CPUs
pti=1 ibrs=0 retp=1 ibpb=0 -> fix variant #1 #3 (for older Intel systems with no microcode update available)

AMD Defaults:

Due to the differences in underlying hardware implementation, AMD x86 systems are not vulnerable to variant #3. The correct default values will be set on AMD hardware based on dynamic checks during the boot sequence.

pti=0 ibrs=0 retp=1 ibpb=1 -> fix variant #1 #2 if the microcode update is applied
pti=0 ibrs=2 retp=1 ibpb=1 -> fix variant #1 #2 on older processors that can disable indirect branch prediction without microcode updates

  • The microcode patch provided by the vendor must be applied in order for the tunables to be visible.
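To compare a running x86 system against the default rows above, the current values can be printed in the same "name=value" layout. The sketch below is illustrative: the function name show_tunables is hypothetical, and the four `<name>_enabled` file names are assumed to match the tunables listed in this article. A tunable whose file is not visible (for example, because the microcode update is missing) is printed as "?".

```shell
#!/bin/sh
# show_tunables [DIR]
# Hypothetical helper: prints the current tunable values found under DIR
# (default: /sys/kernel/debug/x86) on one line, in the order
# pti, ibrs, retp, ibpb. Missing files are shown as "?".
show_tunables() {
    st_dir=${1:-/sys/kernel/debug/x86}
    st_line=
    for st_name in pti ibrs retp ibpb; do
        st_file=$st_dir/${st_name}_enabled
        if [ -r "$st_file" ]; then
            st_value=$(cat "$st_file")
        else
            st_value='?'
        fi
        # Separate entries with a single space, with no leading space.
        st_line="$st_line${st_line:+ }$st_name=$st_value"
    done
    echo "$st_line"
}
```

The resulting line can be compared directly with the Intel and AMD default rows in this section.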

s390x

  • s390x is affected by Spectre (Variants 1 and 2), but not by Meltdown (Variant 3).
  • The mitigation for variant 1 (ppa15) is always active if the feature is available in microcode. It cannot be switched on/off dynamically; you can only disable instruction patching completely via the kernel parameter noaltinstr.
  • For variant 2, mitigation is done via the bpb feature. As with ppa15, it is always active if the feature is available in microcode. It cannot be switched on/off dynamically either, but the kernel parameter nobp, which is enabled by default in RHEL, can be switched off via 'nobp=off' (the RHEL patch uses a 'big hammer' approach, which has a considerable performance impact).

ppc

  • Work to obtain information on the tunables for variants 1 and 2 is still in progress. Please note that these will require patches from the vendor as well.
  • Variant 3 is controlled by the below:

      /sys/devices/system/cpu/rfi_flush
    
  • It is enabled by default and, as of now, can be disabled only by the kernel command line parameters no_rfi_flush or nopti. Specifying either of these will disable the variant 3 mitigation.

  • The dynamic approach to change /sys/devices/system/cpu/rfi_flush will be available in RHEL 7.5 or later. Once it is available, the setting can be changed by running:

      # echo 0 > /sys/kernel/debug/powerpc/rfi_flush

    or

      # echo 1 > /sys/kernel/debug/powerpc/rfi_flush

Tuned for automation

Customers may control these settings by adding the above mentioned tuning commands to a customized tuned-adm profile via this method:

How to create a customized tuned profile

Note that these security fixes for variants #1, #2, and #3 are enabled by default. Therefore, creating a custom tuned profile is only required if the user intends to disable the security fixes.
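One possible shape for such a profile is sketched below. It assumes a tuned version that provides the bootloader plugin and its cmdline option (this may not hold for older releases), and it inherits from the throughput-performance profile purely as an example. The function name write_tuned_profile and the profile name used in the usage note are hypothetical.

```shell
#!/bin/sh
# write_tuned_profile DIR
# Hypothetical helper: creates DIR/tuned.conf for a custom profile that
# inherits from throughput-performance and asks tuned's bootloader plugin
# (assumed to be available) to append the flags to the kernel command line.
write_tuned_profile() {
    mkdir -p "$1" || return 1
    printf '%s\n' \
        '[main]' \
        'summary=Example profile that disables the speculative execution mitigations' \
        'include=throughput-performance' \
        '' \
        '[bootloader]' \
        'cmdline=spectre_v2=off nopti' \
        > "$1/tuned.conf"
}
```

As root, `write_tuned_profile /etc/tuned/no-mitigations` followed by `tuned-adm profile no-mitigations` and a reboot would activate it; switching back to the previous profile re-enables the defaults after the next reboot.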
