Chapter 4. Kernel features

This chapter explains the purpose and use of kernel features that enable many user space tools and includes resources for further investigation of those tools.

4.1. Control groups

4.1.1. What is a control group?

Note

Control Group Namespaces are a Technology Preview in Red Hat Enterprise Linux 7.5.

Linux Control Groups (cgroups) enable limits on the use of system hardware, ensuring that an individual process running inside a cgroup only utilizes as much as has been allowed in the cgroups configuration.

Control Groups restrict how much of a resource a process can use once a namespace has made that resource available. For example, the network namespace allows a process to access a particular network card, and the cgroup ensures that the process does not exceed 50% usage of that card, so that bandwidth remains available for other processes.

Control Group Namespaces provide a virtualized view of individual cgroups through the /proc/self/ns/cgroup interface.

The purpose of Control Group Namespaces is to prevent leakage of privileged data from the global namespaces to the cgroup and to enable other features, such as container migration.

Because it is now much easier to associate a container with a single cgroup, containers have a much more coherent cgroup view. This also enables tasks inside the container to have a virtualized view of the cgroup they belong to.
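
As a rough illustration of this virtualized view, the following sketch prints the cgroup namespace identifier of a process and the cgroup membership that the process sees. The helper name cgroup_ns_id is hypothetical, and the sketch assumes a kernel that exposes /proc/<pid>/ns/cgroup. Where that link is missing, the helper prints unavailable instead.

```shell
# cgroup_ns_id is a hypothetical helper, not part of any package.
# It prints the cgroup namespace link target of a process, which has
# the form "cgroup:[<inode>]", or "unavailable" if the link cannot be read.
cgroup_ns_id() {
    readlink "/proc/${1:-self}/ns/cgroup" 2>/dev/null || echo "unavailable"
}

# Namespace identifier of the current shell.
cgroup_ns_id "$$"

# cgroup membership as seen from inside that namespace.
if [ -r /proc/self/cgroup ]; then
    cat /proc/self/cgroup
fi
```

Two processes that print the same identifier share a cgroup namespace.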

4.1.2. What is a namespace?

Namespaces are a kernel feature that provides a virtual view of isolated system resources. By isolating a process from system resources, you can specify and control what the process is able to interact with. Namespaces are an essential part of Control Groups.

4.1.3. Supported namespaces

The following namespaces are supported in Red Hat Enterprise Linux 7.5 and later:

  • Mount

    • The mount namespace isolates file system mount points, enabling each process to have a distinct file system space within which to operate.
  • UTS

    • Hostname and NIS domain name
  • IPC

    • System V IPC, POSIX message queues
  • PID

    • Process IDs
  • Network

    • Network devices, stacks, ports, etc.
  • User

    • User and group IDs
  • Control Groups

    • Isolates cgroups
Note

Usage of Control Groups is documented in the Resource Management Guide.
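
To see which of the namespaces listed in this section a process belongs to, you can read the symbolic links under /proc/<pid>/ns. The following is a minimal sketch; the helper name list_namespaces and its optional second argument are hypothetical and exist only for illustration.

```shell
# list_namespaces is a hypothetical helper. It prints one line per
# namespace of the given process, for example "mnt -> mnt:[4026531840]".
# The optional second argument overrides the proc mount point.
list_namespaces() {
    pid=${1:-$$}
    proc_root=${2:-/proc}
    for link in "$proc_root/$pid"/ns/*; do
        # Skip the unexpanded glob if the directory is missing or empty.
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}

# Namespaces of the current shell.
list_namespaces "$$"
```

Reading the links of a process owned by another user typically requires root privileges.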

4.2. Kernel source checker

The Linux Kernel Module Source Checker (ksc) is a tool that checks for non-whitelisted symbols in a given kernel module. Red Hat Partners can also use the tool to request review of a symbol for whitelist inclusion by filing a bug in the Red Hat Bugzilla database.

4.2.1. Usage

The tool accepts the path to a module with the "-k" option:

# ksc -k e1000e.ko
Checking against architecture x86_64
Total symbol usage: 165	Total Non white list symbol usage: 74

# ksc -k /path/to/module

Output is saved in $HOME/ksc-result.txt. If review of the symbols for whitelist addition is requested, then the usage description for each non-whitelisted symbol must be added to the ksc-result.txt file. The request bug can then be filed by running ksc with the "-p" option.

Note

KSC currently does not support xz compression

The ksc tool is unable to process the xz compression method and reports the following error:

Invalid architecture, (Only kernel object files are supported)

Until this limitation is resolved, system administrators must manually decompress any third-party modules that use xz compression before running the ksc tool.
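
The manual decompression step can be sketched as follows. The helper name prepare_module and the example path are hypothetical; the sketch assumes the xz utility is installed and that the directory containing the module is writable.

```shell
# prepare_module is a hypothetical helper. For a path ending in .xz it
# decompresses the file next to the original (keeping the original) and
# prints the path of the decompressed module. Other paths are printed
# unchanged.
prepare_module() {
    module=$1
    case "$module" in
        *.xz)
            # --keep preserves the compressed file; --force overwrites
            # an existing decompressed copy.
            xz --decompress --keep --force "$module" || return 1
            printf '%s\n' "${module%.xz}"
            ;;
        *)
            printf '%s\n' "$module"
            ;;
    esac
}

# Example usage (placeholder path; requires ksc to be installed):
#   ksc -k "$(prepare_module /path/to/module.ko.xz)"
```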

4.3. Direct access for files (DAX)

Direct Access for files, known as 'file system dax', or 'fs dax', enables applications to read and write data on a dax-capable storage device without using the page cache to buffer access to the device.

This functionality is available when using the 'ext4' or 'xfs' file system, and is enabled either by mounting the file system with -o dax or by adding dax to the options section for the mount entry in /etc/fstab.
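
The following sketch shows both ways of enabling the option and a way to check which mounted file systems currently use it. The device /dev/pmem0, the mount point /mnt/pmem, and the helper name dax_mounts are placeholders chosen for illustration. The helper assumes that the option appears literally as dax in the mount table.

```shell
# Mounting with DAX (placeholder device and mount point; requires root):
#   mount -o dax /dev/pmem0 /mnt/pmem
# Equivalent /etc/fstab entry:
#   /dev/pmem0  /mnt/pmem  xfs  defaults,dax  0 0

# dax_mounts is a hypothetical helper. It reads mount table text in the
# /proc/mounts format on standard input and prints the mount point of
# every entry whose option list contains the literal option "dax".
dax_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "dax") { print $2; break }
    }'
}

# Check the currently mounted file systems.
if [ -r /proc/mounts ]; then
    dax_mounts < /proc/mounts
fi
```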

Further information, including code examples, can be found in the kernel-doc package at /usr/share/doc/kernel-doc-<version>/Documentation/filesystems/dax.txt, where '<version>' is the corresponding kernel version number.

4.4. Memory protection keys for userspace (also known as PKU, or PKEYS)

Memory Protection Keys provide a mechanism for enforcing page-based protections without requiring modification of the page tables when an application changes protection domains. The feature works by dedicating 4 previously ignored bits in each page table entry to a "protection key", giving 16 possible keys.

Memory Protection Keys are a hardware feature of some Intel CPUs. To determine whether your processor supports this feature, check for the presence of pku in /proc/cpuinfo:

$ grep pku /proc/cpuinfo
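
For use in scripts, the same check can be wrapped so that it matches pku only as a whole word on the flags lines. This is a minimal sketch; the helper name has_cpu_flag and its optional file argument are hypothetical.

```shell
# has_cpu_flag is a hypothetical helper. It succeeds if the given flag
# appears as a whole word on a "flags" line of a cpuinfo-format file.
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    grep '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

if has_cpu_flag pku; then
    echo "pku: reported in /proc/cpuinfo"
else
    echo "pku: not reported in /proc/cpuinfo"
fi
```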

To support this feature, the CPUs provide a new user-accessible register (PKRU) with two separate bits (Access Disable and Write Disable) for each key. Two new instructions (RDPKRU and WRPKRU) exist for reading and writing to the new register.
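
With 16 keys and two bits per key, the register holds 32 permission bits. The following sketch computes the masks for a single key, assuming the layout in which the Access Disable bit for key k is bit 2k and the Write Disable bit is bit 2k+1. The helper name pkru_bits is hypothetical, and the layout should be verified against the processor documentation before relying on it.

```shell
# pkru_bits is a hypothetical helper. It prints the PKRU bit masks for
# one protection key, assuming Access Disable at bit 2*key and
# Write Disable at bit 2*key + 1.
pkru_bits() {
    key=$1
    if [ "$key" -lt 0 ] || [ "$key" -gt 15 ]; then
        echo "key must be between 0 and 15" >&2
        return 1
    fi
    ad=$(( 1 << (2 * key) ))
    wd=$(( 1 << (2 * key + 1) ))
    printf 'key %d: AD=0x%08x WD=0x%08x\n' "$key" "$ad" "$wd"
}

pkru_bits 1   # prints: key 1: AD=0x00000004 WD=0x00000008
```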

Further documentation, including programming examples, can be found in /usr/share/doc/kernel-doc-*/Documentation/x86/protection-keys.txt, which is provided by the kernel-doc package.

4.5. Kernel address space layout randomization

Kernel Address Space Layout Randomization (KASLR) consists of two parts which work together to enhance the security of the Linux kernel:

  • kernel text KASLR
  • memory management KASLR

The physical address and the virtual address of the kernel text are randomized separately, each to a different position. The physical address of the kernel can be anywhere under 64 TB, while the virtual address of the kernel is restricted to the 1 GB range [0xffffffff80000000, 0xffffffffc0000000].

Memory management KASLR has three sections whose starting addresses are randomized within a specific area. KASLR can thus prevent an attacker from inserting malicious code and redirecting kernel execution to it, if that code relies on knowing where symbols of interest are located in the kernel address space.

Memory management KASLR sections are:

  • direct mapping section
  • vmalloc section
  • vmemmap section

KASLR code is now compiled into the Linux kernel, and it is enabled by default. To disable it explicitly, add the nokaslr kernel option to the kernel command line.
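
To check whether a running system was booted with this option, you can inspect /proc/cmdline. The following is a minimal sketch; the helper name kaslr_disabled is hypothetical. The grubby command in the closing comment is one common way to add a kernel option persistently and assumes that the system manages boot entries with grubby.

```shell
# kaslr_disabled is a hypothetical helper. It succeeds if "nokaslr"
# appears as a separate parameter in a kernel command line file and
# returns 2 if the file cannot be read.
kaslr_disabled() {
    cmdline=${1:-/proc/cmdline}
    [ -r "$cmdline" ] || return 2
    awk '{ for (i = 1; i <= NF; i++) if ($i == "nokaslr") found = 1 }
         END { exit !found }' "$cmdline"
}

if kaslr_disabled; then
    echo "KASLR is disabled on the kernel command line"
else
    echo "nokaslr was not found on the kernel command line"
fi

# To add the option persistently on a system that uses grubby
# (requires root, takes effect after the next reboot):
#   grubby --update-kernel=ALL --args="nokaslr"
```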

4.6. Advanced Error Reporting (AER)

4.6.1. What is AER

Advanced Error Reporting (AER) is a kernel feature that provides enhanced error reporting for Peripheral Component Interconnect Express (PCIe) devices. The AER kernel driver attaches to root ports that support the PCIe AER capability in order to:

  • Gather comprehensive error information when errors occur
  • Report errors to users
  • Perform error recovery actions

Example 4.1. Example AER output

Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0: AER: Corrected error received: id=ae00
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0: AER: Multiple Corrected error received: id=ae00
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0000(Receiver ID)
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0:   device [8086:2030] error status/mask=000000c0/00002000
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0:    [ 6] Bad TLP
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0:    [ 7] Bad DLLP
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0: AER: Multiple Corrected error received: id=ae00
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0000(Receiver ID)
Feb  5 15:41:33 hostname kernel: pcieport 10003:00:00.0:   device [8086:2030] error status/mask=00000040/00002000

When AER captures an error, it sends an error message to the console. If the error is repairable, the console output is a warning.
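
To get a quick count of corrected errors per device from such messages, the log can be filtered with a short script. The following is a minimal sketch that assumes the message format shown in Example 4.1; the helper name aer_corrected_by_device is hypothetical.

```shell
# aer_corrected_by_device is a hypothetical helper. It reads kernel log
# lines on standard input and prints "<device> <count>" for each device
# that logged an "AER: ... Corrected error received" message.
aer_corrected_by_device() {
    awk '/AER: (Multiple )?Corrected error received/ {
        # The PCI address is the field just before "AER:".
        for (i = 2; i <= NF; i++) {
            if ($i == "AER:") {
                dev = $(i - 1)
                sub(/:$/, "", dev)
                count[dev]++
                break
            }
        }
    }
    END { for (d in count) printf "%s %d\n", d, count[d] }'
}

# Example usage against the kernel ring buffer:
#   dmesg | aer_corrected_by_device
```

Applied to the output in Example 4.1, the helper reports three corrected-error messages for device 10003:00:00.0.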

4.6.2. Collecting and displaying AER messages

In order to collect and display AER messages, use the rasdaemon program.

Procedure

  1. Install the rasdaemon package.

    ~]# yum install rasdaemon
  2. Enable and start the rasdaemon service.

    ~]# systemctl enable --now rasdaemon
  3. Run the ras-mc-ctl command with the --summary option to display a summary of the logged errors, or with the --errors option to display the errors stored in the error database.

    ~]# ras-mc-ctl --summary
    ~]# ras-mc-ctl --errors

Additional resources

  • For more information on the rasdaemon service, see the rasdaemon(8) manual page.
  • For more information on the ras-mc-ctl utility, see the ras-mc-ctl(8) manual page.