Controlling the Performance Impact of Microcode and Security Patches for CVE-2017-5754, CVE-2017-5715, and CVE-2017-5753 using Red Hat Enterprise Linux Tunables


Overview

Red Hat Customer Portal Labs provides a Spectre And Meltdown Detector to help you detect if your systems are vulnerable to these flaws.

The recent speculative execution CVEs address three potential attack vectors across a wide variety of processor architectures and platforms, each requiring slightly different fixes. In many cases, these fixes also require matching microcode updates provided by hardware vendors.

The security vulnerabilities described in these three CVEs may be found in modern microprocessors and operating systems on major hardware platforms including x86 (Intel and AMD chipsets), System Z, Power and ARM.

Red Hat has made updated kernels available to address these security vulnerabilities. These patches are enabled by default because Red Hat prioritizes out-of-the-box security. Speculative execution is a performance optimization technique, and because these updates (both kernel and microcode) change how it behaves, they may result in workload-specific performance degradation.

Some customers who feel confident that their systems are well protected may wish to disable some or all of the protection mechanisms. For administrators who elect to keep the protection mechanisms enabled in the interest of security, this article also provides a method to conduct performance characterizations with and without the fixes enabled.

Retpoline Kernels

For Red Hat Enterprise Linux versions up through RHEL-7.6, Red Hat uses “retpoline” code sequences for indirect branches in the kernel to isolate those branches from speculative execution. In those OS releases, for Intel processors prior to Skylake, retpolines are used instead of the ibrs feature for mitigation against Spectre variant 2. For Skylake, ibrs is used instead of retpolines.

With the upcoming Red Hat Enterprise Linux 7.7 release, it is planned that all new installations on all Intel processors up through and including Skylake will default to retpolines. However, updates from RHEL 7.6 and earlier to RHEL 7.7 and later will preserve the Spectre variant 2 mitigation method that was in place before the upgrade.

Any updated system that is using ibrs can be switched from using ibrs to retpolines at any time simply by adding the "spectre_v2=retpoline" flag to the kernel boot command line.
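
As a rough illustration of the switch described above, the following sketch checks whether a flag is already present on the running kernel's command line and prints, rather than runs, a grubby invocation that would add it. The helper name cmdline_has and its optional file argument are assumptions made for illustration. The grubby options shown are the commonly documented ones, but confirm them against the grubby documentation for your release.

```shell
# Sketch only: cmdline_has is a hypothetical helper, not a Red Hat tool.
# Returns 0 if the exact flag appears as a word in the kernel command line,
# 1 if it does not, and 2 if the file cannot be read.
# The second argument defaults to /proc/cmdline and exists so the helper
# can be exercised against a sample file.
cmdline_has() {
    flag=$1
    file=${2:-/proc/cmdline}
    [ -r "$file" ] || return 2
    case " $(cat "$file") " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the command an administrator could run as root; nothing is changed here.
if ! cmdline_has "spectre_v2=retpoline"; then
    echo 'grubby --update-kernel=ALL --args="spectre_v2=retpoline"'
fi
```

The new flag only takes effect after the next reboot, at which point cmdline_has "spectre_v2=retpoline" should return 0.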

A patched GCC compiler with Retpoline support is required for compiling the Retpoline-patched kernel and third-party modules. Any third-party kernel module supplied prior to the update will require recompiling from source. SystemTap is one example that uses kernel modules to run code in kernel space, so it also needs the patched compiler. If a kernel module fails to work correctly, please contact the provider of that module and request an update.

Disabling the protection mechanisms:

For Red Hat Enterprise Linux kernels on x86, three debugfs tunables control the behaviour of the various patches in the updated kernel. These patches require updated microcode, which can be obtained from the hardware platform providers.

These debugfs tunables can be enabled or disabled on the kernel command line at boot or at runtime via debugfs controls. The tunables control Page Table Isolation (pti), Indirect Branch Restricted Speculation (ibrs), and retpolines (retp). Depending on the CPU type, Red Hat enables each of these features by default as needed to protect the architecture detected at boot.

For those wanting to disable these protection mechanisms to recover lost performance, the changes can be made at run time or persistently across reboots.

Persistently disable - Effective across a reboot

The first option is to disable them via the kernel command line by adding the flags below, then rebooting for them to take effect.

spectre_v2=off nopti

# For Red Hat Enterprise Linux 8
nospectre_v2 nopti

Note: You can disable each parameter individually. For performance characterization, it is not required that all be disabled simultaneously.

Red Hat recommends using 'grubby' to manage these changes. See the System Administrator's Guide section on grubby for how to use this tool effectively.
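
A minimal sketch of how the flags above might be applied with grubby follows. The helper mitigation_off_args is hypothetical and only restates the flag sets listed in this article. The commands are printed rather than executed, and the --update-kernel, --args, and --remove-args options should be checked against the grubby documentation for your release.

```shell
# Sketch only: mitigation_off_args is a hypothetical helper.
# Prints the kernel flags this article lists for a given RHEL major version.
mitigation_off_args() {
    case "$1" in
        8) echo "nospectre_v2 nopti" ;;
        # The article gives this form without naming a specific release.
        *) echo "spectre_v2=off nopti" ;;
    esac
}

# Build, but do not run, the grubby commands for a RHEL 7 system.
args=$(mitigation_off_args 7)
echo "To disable: grubby --update-kernel=ALL --args=\"$args\""
echo "To restore: grubby --update-kernel=ALL --remove-args=\"$args\""
```

Printing the commands first makes it easy to review exactly which flags will be added before committing to a reboot.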

Runtime disable - Does not persist through a reboot

The second option is to disable them at runtime with the following three commands. The change is immediately active and does not require a reboot.

# echo 0 > /sys/kernel/debug/x86/pti_enabled
# echo 0 > /sys/kernel/debug/x86/retp_enabled
# echo 0 > /sys/kernel/debug/x86/ibrs_enabled

Note that this requires the debugfs filesystem to be mounted. In addition, the "retp_enabled" tunable can only be changed at runtime on RHEL 7 systems; it is read-only on RHEL 6 systems. In RHEL 7, debugfs is mounted by default. In RHEL 6, you can mount it manually with the command:

# mount -t debugfs nodev /sys/kernel/debug
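
The runtime steps above can be wrapped in a small helper so that a missing debugfs mount or a read-only tunable is reported rather than silently ignored. This is a sketch under stated assumptions: set_x86_tunables is a hypothetical name, and the optional directory argument exists only so the function can be tried against a scratch directory.

```shell
# Sketch only: set_x86_tunables is a hypothetical helper.
# Writes a value to each of the three tunables under a directory
# (default: /sys/kernel/debug/x86) and reports files it could not change.
set_x86_tunables() {
    value=$1
    dir=${2:-/sys/kernel/debug/x86}
    status=0
    for name in pti_enabled retp_enabled ibrs_enabled; do
        path=$dir/$name
        if [ -w "$path" ] && echo "$value" > "$path" 2>/dev/null; then
            echo "$name set to $value"
        else
            # Possible causes: debugfs not mounted, a read-only tunable
            # (retp_enabled on RHEL 6), or insufficient privileges.
            echo "$name not changed" >&2
            status=1
        fi
    done
    return $status
}
```

Run as root, set_x86_tunables 0 would attempt the same three writes as the commands above, and a non-zero return indicates that at least one tunable was left untouched.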

Verifying changes

To verify the fixes for these CVEs are correctly disabled, check the contents of the three files to verify their values are all set to 0.

# cat /sys/kernel/debug/x86/pti_enabled
# cat /sys/kernel/debug/x86/retp_enabled
# cat /sys/kernel/debug/x86/ibrs_enabled
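
For scripted checks, the same verification might look like the sketch below. The function name show_x86_tunables and the optional directory argument are illustrative assumptions. The function prints each tunable and returns non-zero unless all three read 0.

```shell
# Sketch only: show_x86_tunables is a hypothetical helper.
# Prints name=value for each tunable and returns non-zero if any
# tunable is missing, unreadable, or not set to 0.
show_x86_tunables() {
    dir=${1:-/sys/kernel/debug/x86}
    not_all_off=0
    for name in pti_enabled retp_enabled ibrs_enabled; do
        if [ -r "$dir/$name" ]; then
            value=$(cat "$dir/$name")
        else
            value=unavailable
        fi
        echo "$name=$value"
        [ "$value" = "0" ] || not_all_off=1
    done
    return $not_all_off
}
```

A line reading "unavailable" usually means debugfs is not mounted or the microcode update that exposes the tunable has not been applied.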

Some applications may still see a small performance loss even with the above CVE flags disabled.

Details:

The rest of this article describes more specifics about each CVE variant.

  • CVE-2017-5753 (variant #1/Spectre) is a bounds-checking exploit during branching. This issue is fixed with a kernel patch. Variant #1 protection is always enabled; it is not possible to disable the patches. Red Hat’s performance testing for variant #1 did not show any measurable impact.

  • CVE-2017-5715 (variant #2/Spectre) is an indirect branching poisoning attack that can lead to data leakage. This attack allows a virtualized guest to read memory from the host system. This issue is corrected with microcode, along with kernel and virtualization updates to both guest and host virtualization software. This vulnerability requires both updated microcode and kernel patches. Variant #2 behavior is controlled by the ibrs tunable, which works in conjunction with the microcode, or by the retp tunable. The ibpb tunable is still visible, but is now read-only and is set by the kernel.

  • CVE-2017-5754 (variant #3/Meltdown) is an exploit that uses speculative cache loading to allow a local attacker to read the contents of memory. This issue is corrected with kernel patches. Variant #3 behavior is controlled by the pti tunable (nopti/pti_enabled).

As noted, installing the microcode update for your hardware, if provided by the hardware vendor, is necessary to protect against variant 2. Please contact your hardware vendor for microcode updates.

Page Table Isolation (pti)

"nopti"/pti_enabled controls the Kernel Page Table Isolation feature, which isolates kernel page tables when running in userland. This feature addresses CVE-2017-5754, also called variant #3 (Meltdown).

Customers and vendors can disable the PTI feature by passing "nopti" to the kernel command line at boot, or dynamically with the runtime debugfs control below:

# echo 0 > /sys/kernel/debug/x86/pti_enabled

Indirect Branch Restricted Speculation (ibrs)

The "noibrs"/ibrs_enabled tunable controls the IBRS feature in the SPEC_CTRL model-specific register (MSR) when SPEC_CTRL is present in cpuid (post microcode update). When ibrs_enabled is set to 1 (spectre_v2=ibrs), the kernel runs with indirect branch restricted speculation, which protects the kernel space from attacks (even from hyperthreading/simultaneous multi-threading attacks). When ibrs_enabled is set to 2 (spectre_v2=ibrs_always), both userland and kernel run with indirect branch restricted speculation. This protects userspace from hyperthreading/simultaneous multi-threading attacks as well, and is also the default on certain old AMD processors (family 10h, 12h and 16h). This feature addresses CVE-2017-5715, variant #2.

When ibrs_enabled is set to 3, only userland runs with indirect branch restricted speculation. This can be used in combination with retpoline (spectre_v2=retpoline,ibrs_user) to provide similar security to ibrs_always with less performance overhead.

The ibrs implementation can be disabled in microcode by passing "noibrs" to the kernel command line at boot, or dynamically with the debugfs control below:

# echo 0 > /sys/kernel/debug/x86/ibrs_enabled
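
To make the four ibrs_enabled values easier to read in scripts or logs, a small lookup such as the following could be used. It is a sketch that restates the values described in this section. The function name describe_ibrs is hypothetical.

```shell
# Sketch only: describe_ibrs is a hypothetical helper that restates
# the ibrs_enabled values described in this article.
describe_ibrs() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "kernel only (spectre_v2=ibrs)" ;;
        2) echo "kernel and userland (spectre_v2=ibrs_always)" ;;
        3) echo "userland only (spectre_v2=retpoline,ibrs_user)" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}
```

For example, describe_ibrs "$(cat /sys/kernel/debug/x86/ibrs_enabled)" would report the mode currently in effect, provided debugfs is mounted and the tunable is visible.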

Indirect Branch Prediction Barriers (ibpb)

Note: The ibpb tuning knob is now read-only and will be set by the kernel if either ibrs or retp is set. As with ibrs, ibpb needs updated microcode in order to work (and be set) correctly.

The ibpb tunable controls the IBPB feature in the PRED_CMD model-specific register (MSR) if either IBPB_SUPPORT or SPEC_CTRL is present in cpuid (post microcode update). When ibpb_enabled is set to 1, an IBPB barrier that flushes the contents of the indirect branch prediction is run across user mode or guest mode context switches to prevent user and guest mode from attacking other applications or virtual machines on the same host. In order to protect virtual machines from other virtual machines, ibpb_enabled=1 is needed even if ibrs_enabled is set to 2. This feature addresses CVE-2017-5715, variant #2.

Architectural Defaults

By default, the appropriate tunables that apply to an architecture will be enabled automatically at boot time, based upon the architecture detected. Per Intel's guidance (https://software.intel.com/security-software-guidance/api-app/sites/default/files/Retpoline-A-Branch-Target-Injection-Mitigation.pdf), both retpolines and IBRS are mitigations for Spectre variant 2. Retpolines will likely have a reduced performance impact.

Red Hat Enterprise Linux defaults on Intel CPUs:

Listed below are the configurations that the kernel defaults to when no kernel command line parameters are provided.

For all pre-Skylake CPUs, and for Skylake with new Red Hat Enterprise Linux 7.7 installations and beyond:

pti=1 ibrs=0 retp=1 ibpb=1 -> mitigates variant #2 #3

For Skylake CPUs for RHEL installations prior to RHEL-7.7:

pti=1 ibrs=1 retp=0 ibpb=1 -> fix variant #1 #2 #3

For older Intel systems with no microcode update available:

pti=1 retp=1 ibrs=0 ibpb=0 -> fix variant #1 #3

Red Hat Enterprise Linux defaults on AMD CPUs:
Due to differences in the underlying hardware implementation, AMD x86 systems are not vulnerable to variant #3. The correct default values will be set on AMD hardware based on dynamic checks during the boot sequence.

pti=0 ibrs=0 ibpb=1 retp=1 -> fix variant #1 #2 if the microcode update is applied
pti=0 ibrs=2 ibpb=1 retp=1 -> fix variant #1 #2 on older processors that can disable indirect branch prediction without microcode updates

Note: A microcode patch provided by the vendor must be applied in order for the tunables to be visible.

s390x Defaults
s390x is affected by Spectre (Variants 1 and 2), but not by Meltdown (Variant 3).

The mitigation for variant 1 (ppa15) is always active if the feature is available in microcode. It cannot be switched on or off dynamically; you can only disable instruction patching completely via the kernel parameter noaltinstr.
For variant 2, mitigation is done via the bpb feature. As with ppa15, it is always active if the feature is available in microcode. It cannot be switched on or off dynamically either, but the nobp kernel parameter, which is enabled by default, can be disabled with nobp=off. (This patch uses a 'big hammer' approach, which has a measurable performance impact.)

IBM POWER Defaults
The mitigations for Spectre (Variants 1 and 2) are provided by the system firmware supplied by the vendor. Currently there is no tunable available on Linux to disable these mitigations. The mitigation for variant 3 is provided by the Linux kernel, without depending on system firmware (although an optimized implementation is used if system firmware provides support for it). It is enabled by default, and can be disabled at boot time with the kernel command line parameters no_rfi_flush or nopti, or at run time with either of the following tunables:

On initial zstream/errata updates:
/sys/devices/system/cpu/rfi_flush

The above tunable will change with upcoming zstream/errata updates to the following:
/sys/kernel/debug/powerpc/rfi_flush
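
Because the tunable's location depends on which update is installed, a script can probe for it before use. The sketch below assumes a hypothetical helper named find_rfi_flush. The two default paths are the ones named above, and any arguments passed replace them.

```shell
# Sketch only: find_rfi_flush is a hypothetical helper.
# Prints the first existing rfi_flush path from the candidates given,
# or from the two locations named in this article when none are given.
find_rfi_flush() {
    if [ $# -eq 0 ]; then
        set -- /sys/kernel/debug/powerpc/rfi_flush \
               /sys/devices/system/cpu/rfi_flush
    fi
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "$path"
            return 0
        fi
    done
    return 1
}
```

A caller might then run path=$(find_rfi_flush) and write to "$path" as root. Writing 0 to disable the flush is an assumption made by analogy with the x86 tunables, so confirm the accepted values for your kernel first.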

Tuned for automation

Customers may control these settings by adding the above mentioned tuning commands to a customized tuned-adm profile via this method:

How to create a customized tuned profile

Note that these security fixes for variants #1 #2 #3 are enabled by default. Therefore creating a custom tuned profile is only required if the user intends to disable the security fixes.
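
One possible shape for such a profile is sketched below. It is not taken from the linked procedure. The profile name no-mitigations and the helper write_nomitigation_profile are hypothetical, and the sketch assumes that the installed tuned version supports the [bootloader] plugin and its cmdline option, which should be confirmed in the tuned documentation for your release.

```shell
# Sketch only: the helper name and the profile name are hypothetical.
# Creates <base>/no-mitigations/tuned.conf (base defaults to /etc/tuned)
# containing the persistent kernel flags listed earlier in this article.
write_nomitigation_profile() {
    base=${1:-/etc/tuned}
    profile=$base/no-mitigations
    mkdir -p "$profile" || return 1
    cat > "$profile/tuned.conf" <<'EOF'
# Assumption: the tuned bootloader plugin is available on this system.
[bootloader]
cmdline=spectre_v2=off nopti
EOF
}
```

After running the helper as root, the profile would typically be activated with tuned-adm profile no-mitigations, followed by a reboot so that the new kernel command line takes effect. Note that activating it replaces the currently active tuned profile.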
