Chapter 4. Important changes to external kernel parameters

This chapter provides system administrators with a summary of significant changes in the kernel shipped with Red Hat Enterprise Linux 8.3. These changes could include, for example, added or updated proc entries, sysctl and sysfs default values, boot parameters, kernel configuration options, or any noticeable behavior changes.

New kernel parameters

acpi_no_watchdog = [HW,ACPI,WDT]

This parameter makes the kernel ignore the Advanced Configuration and Power Interface (ACPI) based watchdog interface (WDAT) and lets the native driver control the watchdog device instead.

dfltcc = [HW,S390]

This parameter configures the zlib hardware support for IBM Z architectures.

Format: { on | off | def_only | inf_only | always }

The options are:

  • on (default) - IBM Z zlib hardware support for compression on level 1 and decompression
  • off - No IBM Z zlib hardware support
  • def_only - IBM Z zlib hardware support for the deflate algorithm only (compression on level 1)
  • inf_only - IBM Z zlib hardware support for the inflate algorithm only (decompression)
  • always - Similar to on, but ignores the selected compression level and always uses hardware support (used for debugging)

irqchip.gicv3_pseudo_nmi = [ARM64]

This parameter enables support for pseudo non-maskable interrupts (NMIs) in the kernel.

To use this parameter, you need to build the kernel with the CONFIG_ARM64_PSEUDO_NMI configuration option enabled.

panic_on_taint =

Bitmask for conditionally calling panic() in add_taint()

Format: <hex>[,nousertaint]

A hexadecimal bitmask that represents a set of TAINT flags. The kernel panics when the add_taint() kernel function is called with any of the flags in this set. The optional nousertaint switch prevents userspace-forced crashes: it rejects writes to the /proc/sys/kernel/tainted file of any flag set that matches the bitmask in panic_on_taint.

For more information, see the upstream documentation.
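As a rough illustration, the bitmask can be assembled from taint flag bit positions. The sketch below assumes the bit numbering listed in the upstream "Tainted kernels" documentation (for example, bit 4 for a machine check and bit 5 for a bad page); the helper name panic_on_taint_arg and the chosen flags are hypothetical, so verify the bit positions against the kernel you run.

```python
# Bit positions assumed from the upstream "Tainted kernels" documentation.
# Only a subset is listed here; verify against your kernel before use.
TAINT_BITS = {
    "PROPRIETARY_MODULE": 0,
    "FORCED_MODULE": 1,
    "MACHINE_CHECK": 4,
    "BAD_PAGE": 5,
    "WARN": 9,
    "OOT_MODULE": 12,
    "UNSIGNED_MODULE": 13,
    "SOFTLOCKUP": 14,
}


def panic_on_taint_arg(flags, nousertaint=False):
    """Build a panic_on_taint=<hex>[,nousertaint] boot argument (hypothetical helper)."""
    mask = 0
    for name in flags:
        mask |= 1 << TAINT_BITS[name]  # set the bit for each selected flag
    arg = f"panic_on_taint=0x{mask:x}"
    if nousertaint:
        arg += ",nousertaint"
    return arg


print(panic_on_taint_arg(["MACHINE_CHECK", "BAD_PAGE"], nousertaint=True))
# prints: panic_on_taint=0x30,nousertaint
```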

prot_virt = [S390]

Format: <bool>

This parameter enables hosting of protected virtual machines which are isolated from the hypervisor if the hardware support is present.

rcutree.use_softirq = [KNL]

This parameter controls whether Tree-RCU uses softirq processing.

If you set this parameter to a non-zero value (default), RCU uses RCU_SOFTIRQ processing. If you specify rcutree.use_softirq=0, all RCU_SOFTIRQ processing moves to per-CPU rcuc kthreads.

split_lock_detect = [X86]

This parameter enables split lock detection. When it is enabled and hardware support is present, atomic instructions that access data across cache line boundaries result in an alignment check exception.

The options are:

  • off - not enabled
  • warn - the kernel will emit rate-limited warnings about applications that trigger the Alignment Check Exception (#AC). This mode is the default on CPUs that support split lock detection.
  • fatal - the kernel will send a bus error (SIGBUS) signal to applications that trigger the #AC exception.

    If the #AC exception is hit while not executing in user mode, the kernel will issue an oops error in either the warn or fatal mode.
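To make the term "split lock" more concrete, the following sketch checks whether a memory access spans two cache lines. It is only an arithmetic illustration: the 64-byte line size is an assumption (typical for x86, but not stated in this chapter), and the function name is hypothetical.

```python
CACHE_LINE_SIZE = 64  # assumption: typical x86 cache line size in bytes


def crosses_cache_line(address, size, line_size=CACHE_LINE_SIZE):
    """Return True if the access [address, address + size) spans two cache lines."""
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return first_line != last_line


print(crosses_cache_line(0x1000, 8))  # aligned 8-byte access: False
print(crosses_cache_line(0x103C, 8))  # starts 4 bytes before a line boundary: True
```

An atomic instruction operating on the second kind of address is the case that split lock detection reports.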

srbds = [X86,INTEL]

This parameter controls the Special Register Buffer Data Sampling (SRBDS) mitigation.

Certain CPUs are vulnerable to a Microarchitectural Data Sampling (MDS)-like exploit which can leak bits from the random number generator.

By default, microcode mitigates this issue. However, the microcode fix can cause the RDRAND and RDSEED instructions to become much slower. Among other effects, this will result in reduced throughput from the urandom kernel random number source device.

To disable the microcode mitigation, set the following option:

  • off - Disable mitigation and remove performance impact on RDRAND and RDSEED

svm = [PPC]

Format: { on | off | y | n | 1 | 0 }

This parameter controls the use of the Protected Execution Facility on pSeries systems.

nopv = [X86,XEN,KVM,HYPER_V,VMWARE]

This parameter disables the paravirtualized (PV) optimizations, which forces the guest to run as a generic guest with no PV drivers.

Currently supported guests are XEN HVM, KVM, HYPER_V, and VMWARE.

Updated kernel parameters

hugepagesz = [HW]

This parameter specifies a huge page size. Use this parameter in conjunction with the hugepages parameter to pre-allocate a number of huge pages of the specified size.

Specify the hugepagesz and hugepages parameters in pairs such as:

hugepagesz=2M hugepages=512

The hugepagesz parameter can only be specified once on the command line for a specific huge page size. Valid huge page sizes are architecture dependent.

hugepages = [HW]

This parameter specifies the number of huge pages to pre-allocate. This parameter typically follows the valid hugepagesz or default_hugepagesz parameter.

However, if hugepages is the first or the only HugeTLB command-line parameter, it implicitly specifies the number of huge pages of the default size to allocate. If the number of huge pages of the default size is implicitly specified, it cannot be overwritten by the hugepagesz + hugepages parameter pair for the default size.

For example, on an architecture with 2M default huge page size:

hugepages=256 hugepagesz=2M hugepages=512

The settings from the example above result in the allocation of 256 2M huge pages and a warning message that the hugepages=512 parameter was ignored. If hugepages is preceded by an invalid hugepagesz, hugepages is ignored.
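The ordering rules for hugepagesz and hugepages can be sketched as a small model. This is a simplified illustration of the rules described in this section, not the kernel's parser; the function name, the valid_sizes tuple, and the warning texts are assumptions made for the example.

```python
def plan_hugepages(cmdline, default_size="2M", valid_sizes=("2M", "1G")):
    """Model the hugepagesz/hugepages ordering rules (simplified; not kernel code)."""
    alloc = {}                # page size -> number of pages to pre-allocate
    warnings = []
    current = None            # size chosen by the most recent valid hugepagesz=
    size_valid = True         # False after an invalid hugepagesz=
    implicit_default = False  # True once a bare hugepages= set the default size
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "hugepagesz":
            size_valid = value in valid_sizes
            current = value if size_valid else None
        elif key == "hugepages":
            if not size_valid:
                warnings.append(f"{token} ignored: invalid hugepagesz")
            elif current is None:
                # hugepages= with no preceding size applies to the default size
                alloc[default_size] = int(value)
                implicit_default = True
            elif current == default_size and implicit_default:
                warnings.append(f"{token} ignored: default size already set")
            else:
                alloc[current] = int(value)
    return alloc, warnings


print(plan_hugepages("hugepages=256 hugepagesz=2M hugepages=512"))
# prints: ({'2M': 256}, ['hugepages=512 ignored: default size already set'])
```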

default_hugepagesz = [HW]

This parameter specifies the default huge page size. You can specify default_hugepagesz only once on the command-line. Optionally, you can follow default_hugepagesz with the hugepages parameter to pre-allocate a specific number of huge pages of the default size. Also, you can implicitly specify the number of default-sized huge pages to pre-allocate.

For example, on an architecture with 2M default huge page size:

hugepages=256
default_hugepagesz=2M hugepages=256
hugepages=256 default_hugepagesz=2M

Each of the settings in the example above results in the allocation of 256 2M huge pages. Valid default huge page sizes are architecture dependent.
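After booting with one of these settings, the result can be checked in /proc/meminfo. The sketch below parses the HugePages_Total and Hugepagesize fields from meminfo-style text; the function name and the sample values are made up for illustration.

```python
def hugepage_summary(meminfo_text):
    """Return (number of default-size huge pages, huge page size in kB)."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep:
            fields[name.strip()] = rest.split()
    total = int(fields["HugePages_Total"][0])
    size_kb = int(fields["Hugepagesize"][0])
    return total, size_kb


# Sample text with made-up values, shaped like /proc/meminfo output.
SAMPLE = """\
MemTotal:       16384000 kB
HugePages_Total:     256
HugePages_Free:      256
Hugepagesize:       2048 kB
"""

total, size_kb = hugepage_summary(SAMPLE)
print(f"{total} huge pages of {size_kb} kB = {total * size_kb // 1024} MiB")
# prints: 256 huge pages of 2048 kB = 512 MiB
```

On a running system, pass the contents of /proc/meminfo instead of the sample text.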

efi = [EFI]

Format: { "old_map", "nochunk", "noruntime", "debug", "nosoftreserve" }

The options are:

  • old_map [X86-64] - Switch to the old ioremap-based EFI runtime services mapping. 32-bit still uses this one by default
  • nochunk - Disable reading files in "chunks" in the EFI boot stub, as chunking can cause problems with some firmware implementations
  • noruntime - Disable EFI runtime services support
  • debug - Enable miscellaneous debug output
  • nosoftreserve - The EFI_MEMORY_SP (Specific Purpose) attribute sometimes causes the kernel to reserve the memory range for a memory mapping driver to claim. Specify efi=nosoftreserve to disable this reservation and treat the memory by its base type (for example, EFI_CONVENTIONAL_MEMORY / "System RAM").

intel_iommu = [DMAR]

Intel IOMMU driver Direct Memory Access Remapping (DMAR).

The added options are:

  • nobounce (Default off) - Disable the bounce buffer for untrusted devices such as Thunderbolt devices. Untrusted devices are then treated as trusted ones, so this setting might expose the system to direct memory access (DMA) attacks.

mem = nn[KMG] [KNL,BOOT]

This parameter forces the usage of a specific amount of memory.

Use this parameter in the following cases:

  1. For testing.
  2. When the kernel is not able to see the whole system memory.
  3. When memory that lies after the mem boundary is excluded from the hypervisor and then assigned to KVM guests.

    [X86] The parameter works by limiting the maximum address. Use it together with the memmap parameter to avoid physical address space collisions. Without memmap, Peripheral Component Interconnect (PCI) devices could be placed at addresses belonging to unused RAM.

    Note that this setting takes effect only at boot time because, in case 3 above, memory may need to be hot-added after boot if the system memory of the hypervisor is not sufficient.
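The nn[KMG] notation can be read as a number with an optional binary suffix. The following sketch converts such a value to bytes; it is a simplified reading of the format shown above (decimal numbers only), and the function name is hypothetical.

```python
import re

# Binary multipliers for the optional K, M, and G suffixes.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mem(value):
    """Convert an nn[KMG] string such as '4G' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not in nn[KMG] form: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.upper()]


print(parse_mem("4G"))  # prints: 4294967296
```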

pci = [PCI]

Various Peripheral Component Interconnect (PCI) subsystem options.

Some options herein operate on a specific device or a set of devices (<pci_dev>). These are specified in one of the following formats:

[<domain>:]<bus>:<dev>.<func>[/<dev>.<func>]*
pci:<vendor>:<device>[:<subvendor>:<subdevice>]

Note that the first format specifies a PCI bus/device/function address which may change if new hardware is inserted, if motherboard firmware changes, or due to changes caused by other kernel parameters. If the domain is left unspecified, it is taken to be zero. Optionally, a path to a device through multiple device/function addresses can be specified after the base address (this is more robust against renumbering issues). The second format selects devices using IDs from the configuration space which may match multiple devices in the system.
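The two <pci_dev> formats can be told apart mechanically. The sketch below classifies a device specification with regular expressions written from the format strings above; it assumes hexadecimal fields and a function number from 0 to 7, and the function name and sample device values are hypothetical.

```python
import re

_HEX = r"[0-9a-fA-F]+"

# [<domain>:]<bus>:<dev>.<func>[/<dev>.<func>]*
_ADDRESS = re.compile(
    rf"(?:(?P<domain>{_HEX}):)?"
    rf"(?P<bus>{_HEX}):(?P<dev>{_HEX})\.(?P<func>[0-7])"
    rf"(?P<path>(?:/{_HEX}\.[0-7])*)"
)

# pci:<vendor>:<device>[:<subvendor>:<subdevice>]
_IDS = re.compile(
    rf"pci:(?P<vendor>{_HEX}):(?P<device>{_HEX})"
    rf"(?::(?P<subvendor>{_HEX}):(?P<subdevice>{_HEX}))?"
)


def classify_pci_dev(spec):
    """Return ('ids' | 'address', fields) for a <pci_dev> specification."""
    match = _IDS.fullmatch(spec)
    if match:
        return "ids", {k: v for k, v in match.groupdict().items() if v is not None}
    match = _ADDRESS.fullmatch(spec)
    if match:
        fields = match.groupdict()
        fields["domain"] = fields["domain"] or "0"  # unspecified domain is zero
        fields["path"] = [p for p in fields["path"].split("/") if p]
        return "address", fields
    raise ValueError(f"unrecognized <pci_dev> specification: {spec!r}")


print(classify_pci_dev("00:1c.0/00.0"))
print(classify_pci_dev("pci:8086:9c31"))
```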

The options are:

  • hpmmiosize - The fixed amount of bus space which is reserved for hotplug bridge’s Memory-mapped I/O (MMIO) window. The default size is 2 megabytes.
  • hpmmioprefsize - The fixed amount of bus space which is reserved for hotplug bridge’s MMIO_PREF window. The default size is 2 megabytes.

pcie_ports = [PCIE]

Peripheral Component Interconnect Express (PCIe) port services handling.

The options are:

  • native - Use native PCIe services (PME, AER, DPC, PCIe hotplug) even if the platform does not give the OS permission to use them. This setting may cause conflicts if the platform also tries to use these services.
  • dpc-native - Use native PCIe service for DPC only. This setting may cause conflicts if firmware uses AER or DPC.
  • compat - Disable native PCIe services (PME, AER, DPC, PCIe hotplug).

rcu_nocbs = [KNL]

The argument is a CPU list. The string "all" can be used to specify every CPU on the system.
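As an illustration of what such an argument selects, the sketch below expands a CPU list into individual CPU numbers. It handles only comma-separated numbers and ranges plus the string "all"; the function name and the nr_cpus argument are assumptions for the example.

```python
def parse_cpu_list(text, nr_cpus):
    """Expand a CPU list such as '0-2,5' (or 'all') into sorted CPU numbers."""
    if text.strip() == "all":
        return list(range(nr_cpus))
    cpus = set()
    for part in text.split(","):
        first, sep, last = part.partition("-")
        start = int(first)
        end = int(last) if sep else start  # a single number is a one-CPU range
        if start > end:
            raise ValueError(f"descending range in CPU list: {part!r}")
        cpus.update(range(start, end + 1))
    return sorted(cpus)


print(parse_cpu_list("0-2,5", nr_cpus=8))  # prints: [0, 1, 2, 5]
print(parse_cpu_list("all", nr_cpus=4))    # prints: [0, 1, 2, 3]
```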
usbcore.authorized_default = [USB]

The default USB device authorization.

The options are:

  • -1 (Default) - Authorized except for wireless USB
  • 0 - Not authorized
  • 1 - Authorized
  • 2 - Authorized if the device is connected to the internal port

usbcore.old_scheme_first = [USB]

This parameter makes the kernel start with the old device initialization scheme. This setting applies only to low-speed and full-speed devices (default 0 = off).

usbcore.quirks = [USB]

A list of quirk entries to augment the built-in USB core quirk list. The list entries are separated by commas. Each entry has the form VendorID:ProductID:Flags, for example quirks=0781:5580:bk,0a5c:5834:gij. The IDs are 4-digit hex numbers and Flags is a set of letters. Each letter toggles the corresponding built-in quirk: the quirk is set if it is clear and cleared if it is set.

The added flags:

  • o - USB_QUIRK_HUB_SLOW_RESET, hub needs extra delay after resetting its port
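The entry format and the toggle behavior can be sketched as follows. This is a minimal model of the VendorID:ProductID:Flags syntax described above, not the USB core implementation; both function names are hypothetical.

```python
def parse_usb_quirks(value):
    """Parse 'VVVV:PPPP:flags,...' into (vendor, product, flag set) tuples."""
    entries = []
    for entry in value.split(","):
        vendor, product, flags = entry.split(":")
        if len(vendor) != 4 or len(product) != 4:
            raise ValueError(f"IDs must be 4-digit hex numbers: {entry!r}")
        entries.append((int(vendor, 16), int(product, 16), set(flags)))
    return entries


def apply_quirk_flags(built_in, flags):
    """Toggle each listed flag: set it if clear, clear it if set."""
    return built_in ^ flags  # symmetric difference models the toggle


for vendor, product, flags in parse_usb_quirks("0781:5580:bk,0a5c:5834:gij"):
    print(f"{vendor:04x}:{product:04x} -> {sorted(flags)}")

print(sorted(apply_quirk_flags({"b", "o"}, {"b", "k"})))  # prints: ['k', 'o']
```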

New /proc/sys/fs parameters

protected_fifos

This parameter is based on the restrictions in the Openwall software and helps avoid unintentional writes to an attacker-controlled FIFO where a program intended to create a regular file.

The options are:

  • 0 - Writing to FIFOs is unrestricted.
  • 1 - Does not allow opening with the O_CREAT flag FIFOs that the caller does not own in world-writable sticky directories, unless the FIFOs are owned by the owner of the directory.
  • 2 - Same as 1, but also applies to group-writable sticky directories.

protected_regular

This parameter is similar to the protected_fifos parameter. However, it avoids writes to an attacker-controlled regular file where a program intended to create one.

The options are:

  • 0 - Writing to regular files is unrestricted.
  • 1 - Does not allow opening with the O_CREAT flag regular files that the caller does not own in world-writable sticky directories, unless the files are owned by the owner of the directory.
  • 2 - Same as 1, but also applies to group-writable sticky directories.
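The rules for protected_fifos and protected_regular follow the same pattern, which can be sketched as one decision function. This is a simplified model of the options listed above, not the kernel's code; the function name and its parameters are assumptions made for the example.

```python
import stat


def create_open_denied(level, dir_mode, dir_uid, file_uid, caller_uid):
    """Return True if an O_CREAT open of an existing FIFO or regular file
    would be refused under the rules listed above (simplified model)."""
    if level == 0:
        return False  # protection is switched off
    if not dir_mode & stat.S_ISVTX:
        return False  # only sticky directories are affected
    if file_uid == caller_uid or file_uid == dir_uid:
        return False  # the caller's own files and the directory owner's files are allowed
    if dir_mode & stat.S_IWOTH:
        return True   # world-writable sticky directory (levels 1 and 2)
    # Level 2 extends the restriction to group-writable sticky directories.
    return level >= 2 and bool(dir_mode & stat.S_IWGRP)


# A file owned by UID 1000 in a /tmp-like directory, opened by UID 1001.
print(create_open_denied(1, 0o1777, dir_uid=0, file_uid=1000, caller_uid=1001))  # True
# The same file in a group-writable sticky directory at level 1.
print(create_open_denied(1, 0o1770, dir_uid=0, file_uid=1000, caller_uid=1001))  # False
```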