Chapter 3. Important Changes to External Kernel Parameters

This chapter provides system administrators with a summary of significant changes in the kernel shipped with Red Hat Enterprise Linux 7.7. These changes include added or updated proc entries, sysctl and sysfs default values, boot parameters, kernel configuration options, and any noticeable behavior changes.

New kernel parameters

usbcore.quirks = [USB]

This parameter provides a list of quirk entries to augment the built-in USB core quirk list.

The entries are separated by commas. Each entry has the form VendorID:ProductID:Flags.

The IDs are 4-digit hex numbers and Flags is a set of letters. Each letter toggles the corresponding built-in quirk: it sets the quirk if it is clear and clears it if it is set. The letters have the following meanings:

  • a = USB_QUIRK_STRING_FETCH_255 (string descriptors must not be fetched using a 255-byte read);
  • b = USB_QUIRK_RESET_RESUME (device cannot resume correctly so reset it instead);
  • c = USB_QUIRK_NO_SET_INTF (device cannot handle Set-Interface requests);
  • d = USB_QUIRK_CONFIG_INTF_STRINGS (device cannot handle its Configuration or Interface strings);
  • e = USB_QUIRK_RESET (device cannot be reset (e.g., morph devices), do not use reset);
  • f = USB_QUIRK_HONOR_BNUMINTERFACES (device has more interface descriptions than the bNumInterfaces count, and cannot handle talking to these interfaces);
  • g = USB_QUIRK_DELAY_INIT (device needs a pause during initialization, after we read the device descriptor);
  • h = USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL (For high speed and super speed interrupt endpoints, the USB 2.0 and USB 3.0 spec require the interval in microframes (1 microframe = 125 microseconds) to be calculated as interval = 2 ^ (bInterval-1). Devices with this quirk report their bInterval as the result of this calculation instead of the exponent variable used in the calculation);
  • i = USB_QUIRK_DEVICE_QUALIFIER (device cannot handle device_qualifier descriptor requests);
  • j = USB_QUIRK_IGNORE_REMOTE_WAKEUP (device generates spurious wakeup, ignore remote wakeup capability);
  • k = USB_QUIRK_NO_LPM (device cannot handle Link Power Management);
  • l = USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL (device reports its bInterval as linear frames instead of the USB 2.0 calculation);
  • m = USB_QUIRK_DISCONNECT_SUSPEND (device needs to be disconnected before suspend to prevent spurious wakeup);
  • n = USB_QUIRK_DELAY_CTRL_MSG (device needs a pause after every control message).

    Example entry:

    quirks=0781:5580:bk,0a5c:5834:gij
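
The entry format described above can be decoded mechanically. The following Python sketch is illustrative only: the function name parse_usb_quirks and the QUIRK_FLAGS table are hypothetical helpers built from the letter list in this section, not part of any kernel or Red Hat tooling.

```python
# Flag letters and quirk names, copied from the list in this section.
QUIRK_FLAGS = {
    "a": "USB_QUIRK_STRING_FETCH_255",
    "b": "USB_QUIRK_RESET_RESUME",
    "c": "USB_QUIRK_NO_SET_INTF",
    "d": "USB_QUIRK_CONFIG_INTF_STRINGS",
    "e": "USB_QUIRK_RESET",
    "f": "USB_QUIRK_HONOR_BNUMINTERFACES",
    "g": "USB_QUIRK_DELAY_INIT",
    "h": "USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL",
    "i": "USB_QUIRK_DEVICE_QUALIFIER",
    "j": "USB_QUIRK_IGNORE_REMOTE_WAKEUP",
    "k": "USB_QUIRK_NO_LPM",
    "l": "USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL",
    "m": "USB_QUIRK_DISCONNECT_SUSPEND",
    "n": "USB_QUIRK_DELAY_CTRL_MSG",
}


def parse_usb_quirks(value):
    """Split 'VendorID:ProductID:Flags,...' into (vendor, product, [quirk names])."""
    entries = []
    for entry in value.split(","):
        vendor, product, flags = entry.strip().split(":")
        # The IDs are documented as 4-digit hex numbers; reject anything else.
        for ident in (vendor, product):
            if len(ident) != 4:
                raise ValueError("ID must have 4 hex digits: %r" % ident)
            int(ident, 16)  # raises ValueError if not hexadecimal
        unknown = [letter for letter in flags if letter not in QUIRK_FLAGS]
        if unknown:
            raise ValueError("unknown flag letters: %r" % unknown)
        names = [QUIRK_FLAGS[letter] for letter in flags]
        entries.append((vendor.lower(), product.lower(), names))
    return entries
```

Applied to the example entry, parse_usb_quirks("0781:5580:bk,0a5c:5834:gij") yields two entries: device 0781:5580 with the RESET_RESUME and NO_LPM quirks toggled, and device 0a5c:5834 with DELAY_INIT, DEVICE_QUALIFIER, and IGNORE_REMOTE_WAKEUP toggled.
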

ppc_tm = [PPC]

Disables Hardware Transactional Memory.

Format: {"off"}

cgroup.memory = [KNL]

Passes options to the cgroup memory controller.

Format: <string>

nokmem - This option disables kernel memory accounting.

mds = [X86,INTEL]

Controls mitigation for the Micro-architectural Data Sampling (MDS) vulnerability.

Certain CPUs are vulnerable to an exploit against CPU internal buffers which can forward information to a disclosure gadget under certain conditions.

In vulnerable processors, the speculatively forwarded data can be used in a cache side channel attack, to access data to which the attacker does not have direct access.

The options are:

  • full - Enable MDS mitigation on vulnerable CPUs.
  • full,nosmt - Enable MDS mitigation and disable Simultaneous multithreading (SMT) on vulnerable CPUs.
  • off - Unconditionally disable MDS mitigation.

    Not specifying this option is equivalent to mds=full.
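
To check which mds setting a given kernel command line requests, a small script can scan it for the option and fall back to the documented default. This is a minimal sketch: the function name mds_setting is hypothetical, and the rule that the last mds= occurrence takes effect is an assumption, not something stated in this chapter. It also ignores the mitigations= parameter.

```python
def mds_setting(cmdline):
    """Return the mds= value requested on a kernel command line string."""
    setting = "full"  # documented default when mds= is not specified
    for token in cmdline.split():
        if token.startswith("mds="):
            # Assumption: a later occurrence overrides an earlier one.
            setting = token[len("mds="):]
    return setting
```

On a running system, the command line is available in /proc/cmdline, so mds_setting(open("/proc/cmdline").read()) reports the requested value.
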

mitigations = [X86,PPC,S390]

Controls optional mitigations for CPU vulnerabilities. This is a set of curated, arch-independent options, each of which is an aggregation of existing arch-specific options.

The options are:

  • off - Disable all optional CPU mitigations. This improves system performance, but it may also expose users to several CPU vulnerabilities.

    Equivalent to:

    • nopti [X86,PPC]
    • nospectre_v1 [PPC]
    • nobp=0 [S390]
    • nospectre_v2 [X86,PPC,S390]
    • spec_store_bypass_disable=off [X86,PPC]
    • l1tf=off [X86]
    • mds=off [X86]
  • auto (default) - Mitigate all CPU vulnerabilities, but leave Simultaneous multithreading (SMT) enabled, even if it’s vulnerable. This is for users who do not want to be surprised by SMT getting disabled across kernel upgrades, or who have other ways of avoiding SMT-based attacks.

    Equivalent to:

    • (default behavior)
  • auto,nosmt - Mitigate all CPU vulnerabilities, disabling Simultaneous multithreading (SMT) if needed. This is for users who always want to be fully mitigated, even if it means losing SMT.

    Equivalent to:

    • l1tf=flush,nosmt [X86]
    • mds=full,nosmt [X86]
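
The equivalence lists above can be captured as a lookup table, which makes it easy to see what a given mitigations value expands to on one architecture. The sketch below is illustrative: the table is transcribed from this section, and the names EQUIVALENTS and equivalent_options are hypothetical.

```python
# (option, architectures) pairs transcribed from the lists in this section.
EQUIVALENTS = {
    "off": [
        ("nopti", {"X86", "PPC"}),
        ("nospectre_v1", {"PPC"}),
        ("nobp=0", {"S390"}),
        ("nospectre_v2", {"X86", "PPC", "S390"}),
        ("spec_store_bypass_disable=off", {"X86", "PPC"}),
        ("l1tf=off", {"X86"}),
        ("mds=off", {"X86"}),
    ],
    "auto": [],  # default behavior, no extra options
    "auto,nosmt": [
        ("l1tf=flush,nosmt", {"X86"}),
        ("mds=full,nosmt", {"X86"}),
    ],
}


def equivalent_options(value, arch):
    """List the arch-specific options that mitigations=<value> stands for on arch."""
    return [option for option, arches in EQUIVALENTS[value] if arch in arches]
```

For example, equivalent_options("off", "S390") returns the two options that apply to IBM Z, nobp=0 and nospectre_v2.
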

watchdog_thresh = [KNL]

Sets the hard lockup detector stall duration threshold in seconds.

The soft lockup detector threshold is set to twice the value.

A value of 0 disables both lockup detectors. Default is 10 seconds.
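
The relationship between the two thresholds can be written out as a short calculation. This is only a sketch of the rule stated above; the function name lockup_thresholds and the dictionary keys are hypothetical.

```python
def lockup_thresholds(watchdog_thresh=10):
    """Return both detector thresholds in seconds, or None if both are disabled."""
    if watchdog_thresh < 0:
        raise ValueError("watchdog_thresh must not be negative")
    if watchdog_thresh == 0:
        return None  # a value of 0 disables both lockup detectors
    return {
        "hard_lockup_s": watchdog_thresh,
        "soft_lockup_s": 2 * watchdog_thresh,  # soft threshold is twice the value
    }
```

With the default of 10, the hard lockup threshold is 10 seconds and the soft lockup threshold is 20 seconds.
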

novmcoredd [KNL,KDUMP]

Disables device dump. The device dump allows drivers to append dump data to vmcore so that you can collect driver-specified debug information.

Drivers can append the data without any limit, and this data is stored in memory, so this may cause significant memory stress.

Disabling device dump can help save memory, but the driver debug data will no longer be available.

This parameter is only available when CONFIG_PROC_VMCORE_DEVICE_DUMP is set.

Updated kernel parameters

resource_alignment

Specifies alignment and device to reassign aligned memory resources.

Format:

  • [<order of align>@][<domain>:]<bus>:<slot>.<func>[; …]
  • [<order of align>@]pci:<vendor>:<device>[:<subvendor>:<subdevice>][; …]

If <order of align> is not specified, PAGE_SIZE is used as the alignment. A PCI-PCI bridge can be specified if resource windows need to be expanded.
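
As an illustration of the first format, the sketch below assembles one entry from its parts. The function name resource_alignment_entry is hypothetical, and two details are assumptions not stated in this chapter: that domain, bus, slot, and function are written in hexadecimal (as lspci prints them), and that the order is a decimal number.

```python
def resource_alignment_entry(bus, slot, func, order=None, domain=None):
    """Build one '[<order of align>@][<domain>:]<bus>:<slot>.<func>' entry."""
    # Assumption: address fields are hexadecimal, as in lspci output.
    entry = "%02x:%02x.%x" % (bus, slot, func)
    if domain is not None:
        entry = "%04x:%s" % (domain, entry)
    if order is not None:
        # Assumption: the order is written as a decimal number.
        entry = "%d@%s" % (order, entry)
    return entry


def resource_alignment_value(entries):
    """Join several entries with ';' as the format allows."""
    return ";".join(entries)
```

For example, resource_alignment_entry(0x00, 0x1f, 3, order=20) produces 20@00:1f.3, and omitting order leaves the alignment at PAGE_SIZE as described above.
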

irqaffinity = [SMP]

Sets the default irq affinity mask.

Format:

  • <cpu number>,…,<cpu number>
  • <cpu number>-<cpu number> (must be a positive range in ascending order)
  • a mixture: <cpu number>,…,<cpu number>-<cpu number>
  • drivers

    With the drivers value, the kernel uses the drivers' affinity masks for default interrupt assignment instead of placing all interrupts on CPU0.
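
The CPU list formats can be expanded into explicit CPU numbers, and from there into a bitmask, with a few lines of code. This is a minimal sketch: the function names parse_cpu_list and affinity_mask are hypothetical, and the drivers keyword is deliberately not handled.

```python
def parse_cpu_list(value):
    """Expand a list such as '0,2-4' into a sorted list of CPU numbers."""
    cpus = set()
    for part in value.split(","):
        if "-" in part:
            first, last = (int(number) for number in part.split("-"))
            # Ranges must be positive and in ascending order.
            if first > last:
                raise ValueError("range must be ascending: %r" % part)
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def affinity_mask(cpus):
    """Return the bitmask with one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask
```

For example, the mixture 0,2-4 expands to CPUs 0, 2, 3, and 4, which corresponds to the mask 0x1d.
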

New /proc/sys/net/core parameters

bpf_jit_kallsyms

If the Berkeley Packet Filter (BPF) Just in Time (JIT) compiler is enabled, the kernel does not know the addresses of the compiled images, so they show up neither in traces nor in the /proc/kallsyms file. This parameter enables export of these addresses, which can be used for debugging and tracing. If the bpf_jit_harden parameter is enabled, this feature is disabled.

Possible values are:

0 – Disable Just in Time (JIT) kallsyms export (default value).

1 – Enable Just in Time (JIT) kallsyms export for privileged users only.
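
The conditions described above can be combined into a single check. The sketch below is illustrative: the function names are hypothetical, and it assumes that the related bpf_jit_enable and bpf_jit_harden settings live in the same /proc/sys/net/core directory and that any nonzero bpf_jit_harden value counts as enabled.

```python
import os


def kallsyms_export_active(jit_enable, jit_kallsyms, jit_harden):
    """True when JIT is on, kallsyms export is on, and hardening is off."""
    return jit_enable != 0 and jit_kallsyms == 1 and jit_harden == 0


def read_net_core(name, base="/proc/sys/net/core"):
    """Read one integer sysctl value; 'base' can be overridden for testing."""
    with open(os.path.join(base, name)) as handle:
        return int(handle.read().strip())


def check_running_system():
    """Evaluate the condition on the running system (needs read access)."""
    return kallsyms_export_active(
        read_net_core("bpf_jit_enable"),
        read_net_core("bpf_jit_kallsyms"),
        read_net_core("bpf_jit_harden"),
    )
```

Even when the check returns true, the exported addresses are visible to privileged users only.
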

Updated /proc/sys/fs parameters

dentry-state

Dentries are dynamically allocated and deallocated.

From linux/include/linux/dcache.h:

struct dentry_stat_t {
        int nr_dentry;
        int nr_unused;
        int age_limit;         /* age in seconds */
        int want_pages;        /* pages requested by system */
        int nr_negative;       /* # of unused negative dentries */
        int dummy;             /* reserved for future use */
};
extern struct dentry_stat_t dentry_stat;

The nr_dentry number shows the total number of dentries allocated (active + unused).

The nr_unused number shows the number of dentries that are not actively used, but are saved in the least recently used (LRU) list for future reuse.

The age_limit number is the age in seconds after which dcache entries can be reclaimed when memory is short. The want_pages number is nonzero when the shrink_dcache_pages() function has been called and the dcache has not been pruned yet.

The nr_negative number shows the number of unused dentries that are also negative dentries, which do not map to any file. Instead, they help speed up the rejection of lookups for nonexistent files requested by users.
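
The six fields can be read back from the dentry-state file in the order given by the structure. The following Python sketch is illustrative: the names DentryState, parse_dentry_state, and active_dentries are hypothetical, and it assumes the file contains exactly six whitespace-separated integers.

```python
from collections import namedtuple

# Field order follows the structure shown above.
DentryState = namedtuple(
    "DentryState",
    "nr_dentry nr_unused age_limit want_pages nr_negative dummy",
)


def parse_dentry_state(text):
    """Parse the contents of /proc/sys/fs/dentry-state into named fields."""
    fields = [int(item) for item in text.split()]
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d" % len(fields))
    return DentryState(*fields)


def active_dentries(state):
    """nr_dentry counts active plus unused dentries, so subtract the unused ones."""
    return state.nr_dentry - state.nr_unused
```

On a running system, parse_dentry_state(open("/proc/sys/fs/dentry-state").read()) returns the current values, and the nr_negative field gives the number of unused negative dentries.
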