Chapter 10. Known issues

This part describes known issues in Red Hat Enterprise Linux 8.5.

10.1. Installer and image creation

GUI installation might fail if an attempt to unregister using the CDN is made before the repository refresh is completed

Since RHEL 8.2, when registering your system and attaching subscriptions using the Content Delivery Network (CDN), a refresh of the repository metadata is started by the GUI installation program. The refresh process is not part of the registration and subscription process, and as a consequence, the Unregister button is enabled in the Connect to Red Hat window. Depending on the network connection, the refresh process might take more than a minute to complete. If you click the Unregister button before the refresh process is completed, the GUI installation might fail as the unregister process removes the CDN repository files and the certificates required by the installation program to communicate with the CDN.

To work around this problem, complete the following steps in the GUI installation after you have clicked the Register button in the Connect to Red Hat window:

  1. From the Connect to Red Hat window, click Done to return to the Installation Summary window.
  2. From the Installation Summary window, verify that the Installation Source and Software Selection status messages in italics are not displaying any processing information.
  3. When the Installation Source and Software Selection categories are ready, click Connect to Red Hat.
  4. Click the Unregister button.

After performing these steps, you can safely unregister the system during the GUI installation.

(BZ#1821192)

Registration fails for user accounts that belong to multiple organizations

Currently, when you attempt to register a system with a user account that belongs to multiple organizations, the registration process fails with the error message You must specify an organization for new units.

To work around this problem, you can either:

  • Use a different user account that does not belong to multiple organizations.
  • Use the Activation Key authentication method available in the Connect to Red Hat feature for GUI and Kickstart installations.
  • Skip the registration step in Connect to Red Hat and use Subscription Manager to register your system post-installation.

(BZ#1822880)

The USB CD-ROM drive is not available as an installation source in Anaconda

Installation fails when the USB CD-ROM drive is used as the installation source and the Kickstart ignoredisk --only-use= command is specified. In this case, Anaconda cannot find and use this source disk.

To work around this problem, use the harddrive --partition=sdX --dir=/ command to install from the USB CD-ROM drive. As a result, the installation does not fail.

(BZ#1914955)

The auth and authconfig Kickstart commands require the AppStream repository

The authselect-compat package is required by the auth and authconfig Kickstart commands during installation. Without this package, the installation fails if auth or authconfig is used. However, by design, the authselect-compat package is only available in the AppStream repository.

To work around this problem, verify that the BaseOS and AppStream repositories are available to the installer or use the authselect Kickstart command during installation.

(BZ#1640697)

The reboot --kexec and inst.kexec commands do not provide a predictable system state

Performing a RHEL installation with the reboot --kexec Kickstart command or the inst.kexec kernel boot parameters does not provide the same predictable system state as a full reboot. As a consequence, switching to the installed system without rebooting can produce unpredictable results.

Note that the kexec feature is deprecated and will be removed in a future release of Red Hat Enterprise Linux.

(BZ#1697896)

Network access is not enabled by default in the installation program

Several installation features require network access, for example, registration of a system using the Content Delivery Network (CDN), NTP server support, and network installation sources. However, network access is not enabled by default, and as a result, these features cannot be used until network access is enabled.

To work around this problem, add ip=dhcp to boot options to enable network access when the installation starts. Optionally, passing a Kickstart file or a repository located on the network using boot options also resolves the problem. As a result, the network-based installation features can be used.
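
For example, a minimal sketch of the boot options for a network installation; the repository URL is a placeholder, and the inst.repo entry is only needed when you use a network installation source:

ip=dhcp inst.repo=http://example.com/rhel8/BaseOS/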

(BZ#1757877)

Hard drive partitioned installations with the iso9660 filesystem fail

You cannot install RHEL on systems where the hard drive is partitioned with the iso9660 filesystem. This is due to the updated installation code that is set to ignore any hard disk containing an iso9660 file system partition. This happens even when RHEL is installed without using a DVD.

To work around this problem, add the following script to the Kickstart file to format the disk before the installation starts.

Note: Before performing the workaround, back up the data on the disk. The wipefs command removes all the existing data from the disk.

%pre
wipefs -a /dev/sda
%end

As a result, installations work as expected without any errors.

(BZ#1929105)

IBM Power systems with HASH MMU mode fail to boot with memory allocation failures

IBM Power Systems with the HASH memory management unit (MMU) mode support kdump up to a maximum of 192 cores. Consequently, the system fails to boot with memory allocation failures if kdump is enabled on more than 192 cores. This limitation is due to RMA memory allocations during early boot in HASH MMU mode. To work around this problem, use the Radix MMU mode with fadump enabled instead of using kdump.

(BZ#2028361)

Adding the same username in both blueprint and Kickstart files causes Edge image installation to fail

To install a RHEL for Edge image, users must create a blueprint to build a rhel-edge-container image and also create a Kickstart file to install the RHEL for Edge image. When a user adds the same username, password, and SSH key in both the blueprint and the Kickstart file, the RHEL for Edge image installation fails. Currently, there is no workaround.

(BZ#1951964)

The new osbuild-composer back end does not replicate the blueprint state from lorax-composer on upgrades

When Image Builder users upgrade from the lorax-composer back end to the new osbuild-composer back end, blueprints can disappear. As a result, once the upgrade is complete, the blueprints are not displayed automatically. To work around this problem, perform the following steps.

Prerequisites

  • You have the composer-cli CLI utility installed.

Procedure

  1. Run the command to load the previous lorax-composer based blueprints into the new osbuild-composer back end:

    $ for blueprint in $(find /var/lib/lorax/composer/blueprints/git/workspace/master -name '*.toml'); do composer-cli blueprints push "${blueprint}"; done

As a result, the same blueprints are now available in the osbuild-composer back end.

(BZ#1897383)

Unexpected SELinux policies on systems where Anaconda is running as an application

When Anaconda is running as an application on an already installed system (for example, to perform another installation to an image file using the --image anaconda option), the system is not prevented from modifying the SELinux types and attributes during installation. As a consequence, certain elements of SELinux policy might change on the system where Anaconda is running. To work around this problem, do not run Anaconda on the production system; instead, run it in a temporary virtual machine so that the SELinux policy on the production system is not modified. Running Anaconda as part of the system installation process, such as installing from boot.iso or dvd.iso, is not affected by this issue.

(BZ#2050140)

10.2. Subscription management

syspurpose addons have no effect on the subscription-manager attach --auto output

In Red Hat Enterprise Linux 8, four attributes of the syspurpose command-line tool have been added: role, usage, service_level_agreement, and addons. Currently, only role, usage, and service_level_agreement affect the output of running the subscription-manager attach --auto command. Users who attempt to set values for the addons argument will not observe any effect on the subscriptions that are auto-attached.

(BZ#1687900)

10.3. Software management

libdnf-devel upgrade fails if the CodeReady Linux Builder repository is not available on the system

The libdnf-devel package has been moved from the BaseOS repository to the CodeReady Linux Builder repository. Consequently, upgrading libdnf-devel fails if the CodeReady Linux Builder repository is not available on the system.

To work around this problem, enable the CodeReady Linux Builder repository, or remove the libdnf-devel package prior to the upgrade.
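
For example, on the AMD and Intel 64-bit architecture, you can enable the repository as follows; the repository ID differs on other architectures:

# subscription-manager repos --enable codeready-builder-for-rhel-8-x86_64-rpms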

(BZ#1960616)

cr_compress_file_with_stat() can cause a memory leak

The createrepo_c library provides the cr_compress_file_with_stat() API function. This function is declared with char **dst as its second parameter. Depending on its other parameters, cr_compress_file_with_stat() either uses dst as an input parameter or uses it to return an allocated string. This unpredictable behavior can cause a memory leak, because the function does not inform the user when to free the dst contents.

To work around this problem, a new cr_compress_file_with_stat_v2() API function has been added, which uses the dst parameter only as an input. It is declared as char *dst. This prevents the memory leak.

Note that the cr_compress_file_with_stat_v2 function is temporary and will be present only in RHEL 8. Later, cr_compress_file_with_stat() will be fixed instead.

(BZ#1973588)

10.4. Shells and command-line tools

coreutils might report misleading EPERM error codes

GNU Core Utilities (coreutils) started using the statx() system call. If a seccomp filter returns an EPERM error code for unknown system calls, coreutils might consequently report misleading EPERM error codes, because EPERM cannot be distinguished from the actual Operation not permitted error returned by a working statx() syscall.

To work around this problem, update the seccomp filter to either permit the statx() syscall or return an ENOSYS error code for syscalls that it does not recognize.

(BZ#2030661)

10.5. Infrastructure services

Postfix TLS fingerprint algorithm in the FIPS mode needs to be changed to SHA-256

By default in RHEL 8, postfix uses MD5 fingerprints with TLS for backward compatibility. However, in FIPS mode, the MD5 hashing function is not available, which may cause TLS to function incorrectly in the default postfix configuration. To work around this problem, change the hashing function to SHA-256 in the postfix configuration file.
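
A minimal sketch of the change, assuming the fingerprint digest parameters in main.cf are the only settings that need adjusting:

# postconf -e smtp_tls_fingerprint_digest=sha256
# postconf -e smtpd_tls_fingerprint_digest=sha256
# systemctl restart postfix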

For more details, see the related Knowledgebase article Fix postfix TLS in the FIPS mode by switching to SHA-256 instead of MD5.

(BZ#1711885)

The brltty package is not multilib compatible

It is not possible to have both 32-bit and 64-bit versions of the brltty package installed. You can either install the 32-bit (brltty.i686) or the 64-bit (brltty.x86_64) version of the package. The 64-bit version is recommended.

(BZ#2008197)

10.6. Security

File permissions of /etc/passwd- are not aligned with the CIS RHEL 8 Benchmark 1.0.0

Because of an issue with the CIS Benchmark, the remediation of the SCAP rule that ensures permissions on the /etc/passwd- backup file configures permissions to 0644. However, the CIS Red Hat Enterprise Linux 8 Benchmark 1.0.0 requires file permissions 0600 for that file. As a consequence, the file permissions of /etc/passwd- are not aligned with the benchmark after remediation.
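
If your environment requires alignment with the benchmark, you can tighten the permissions manually after remediation, for example:

# chmod 0600 /etc/passwd-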

(BZ#1858866)

libselinux-python is available only through its module

The libselinux-python package contains only Python 2 bindings for developing SELinux applications and it is used for backward compatibility. For this reason, libselinux-python is no longer available in the default RHEL 8 repositories through the dnf install libselinux-python command.

To work around this problem, enable both the libselinux-python and python27 modules, and install the libselinux-python package and its dependencies with the following commands:

# dnf module enable libselinux-python
# dnf install libselinux-python

Alternatively, install libselinux-python using its install profile with a single command:

# dnf module install libselinux-python:2.8/common

As a result, you can install libselinux-python using the respective module.

(BZ#1666328)

udica processes UBI 8 containers only when started with --env container=podman

The Red Hat Universal Base Image 8 (UBI 8) containers set the container environment variable to the oci value instead of the podman value. This prevents the udica tool from analyzing a container JavaScript Object Notation (JSON) file.

To work around this problem, start a UBI 8 container using a podman command with the --env container=podman parameter. As a result, udica can generate an SELinux policy for a UBI 8 container only when you use the described workaround.
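
For example, a sketch using the standard UBI 8 image name from the Red Hat registry:

# podman run --env container=podman -it registry.access.redhat.com/ubi8/ubi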

(BZ#1763210)

Negative effects of the default logging setup on performance

The default logging environment setup might consume 4 GB of memory or more, and adjusting rate-limit values is complex when systemd-journald is running with rsyslog.

See the Negative effects of the RHEL default logging setup on performance and their mitigations Knowledgebase article for more information.

(JIRA:RHELPLAN-10431)

SELINUX=disabled in /etc/selinux/config does not work properly

Disabling SELinux using the SELINUX=disabled option in the /etc/selinux/config file results in a process in which the kernel boots with SELinux enabled and switches to disabled mode later in the boot process. This might cause memory leaks.

To work around this problem, if your scenario really requires completely disabling SELinux, disable it by adding the selinux=0 parameter to the kernel command line, as described in the Changing SELinux modes at boot time section of the Using SELinux title.
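
For example, you can add the parameter to all installed kernels with grubby and then reboot; this sketch assumes the default boot loader configuration:

# grubby --update-kernel=ALL --args="selinux=0"
# reboot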

(JIRA:RHELPLAN-34199)

crypto-policies incorrectly allow Camellia ciphers

The RHEL 8 system-wide cryptographic policies should disable Camellia ciphers in all policy levels, as stated in the product documentation. However, the Kerberos protocol enables the ciphers by default.

To work around the problem, apply the NO-CAMELLIA subpolicy:

# update-crypto-policies --set DEFAULT:NO-CAMELLIA

In the previous command, replace DEFAULT with the cryptographic level name if you have switched from DEFAULT previously.

As a result, Camellia ciphers are correctly disallowed across all applications that use system-wide crypto policies, but only after you apply the described workaround.

(BZ#1919155)

Using multiple labeled IPsec connections with IKEv2 does not work correctly

When Libreswan uses the IKEv2 protocol, security labels for IPsec do not work correctly for more than one connection. As a consequence, Libreswan using labeled IPsec can establish only the first connection, but cannot establish subsequent connections correctly. To use more than one connection, use the IKEv1 protocol.

(BZ#1934859)

Smart-card provisioning process through OpenSC pkcs15-init does not work properly

The file_caching option is enabled in the default OpenSC configuration, and the file caching functionality does not handle some commands from the pkcs15-init tool properly. Consequently, the smart-card provisioning process through OpenSC fails.

To work around the problem, add the following snippet to the /etc/opensc.conf file:

app pkcs15-init {
        framework pkcs15 {
                use_file_caching = false;
        }
}

Smart-card provisioning through pkcs15-init works only if you apply the previously described workaround.

(BZ#1947025)

Connections to servers with SHA-1 signatures do not work with GnuTLS

SHA-1 signatures in certificates are rejected by the GnuTLS secure communications library as insecure. Consequently, applications that use GnuTLS as a TLS backend cannot establish a TLS connection to peers that offer such certificates. This behavior is inconsistent with other system cryptographic libraries.

To work around this problem, upgrade the server to use certificates signed with SHA-256 or stronger hash, or switch to the LEGACY policy.

(BZ#1628553)

OpenSSL in FIPS mode accepts only specific D-H parameters

In FIPS mode, TLS clients that use OpenSSL return a bad dh value error and abort TLS connections to servers that use manually generated parameters. This is because OpenSSL, when configured to work in compliance with FIPS 140-2, works only with Diffie-Hellman parameters compliant with NIST SP 800-56A rev3 Appendix D (groups 14, 15, 16, 17, and 18 defined in RFC 3526 and the groups defined in RFC 7919). Also, servers that use OpenSSL ignore all other parameters and instead select known parameters of similar size. To work around this problem, use only the compliant groups.

(BZ#1810911)

IKE over TCP connections do not work on custom TCP ports

The tcp-remoteport Libreswan configuration option does not work properly. Consequently, an IKE over TCP connection cannot be established when a scenario requires specifying a non-default TCP port.

(BZ#1989050)

Conflict in SELinux Audit rules and SELinux boolean configurations

If the Audit rule list includes an Audit rule that contains a subj_* or obj_* field, and the SELinux boolean configuration changes, setting the SELinux booleans causes a deadlock. As a consequence, the system stops responding and requires a reboot to recover. To work around this problem, disable all Audit rules containing the subj_* or obj_* field, or temporarily disable such rules before changing SELinux booleans.

With the release of the RHSA-2021:2168 advisory, the kernel handles this situation properly and no longer deadlocks.

(BZ#1924230)

systemd cannot execute commands from arbitrary paths

The systemd service cannot execute commands from arbitrary paths, such as /home/user/bin, because the SELinux policy package does not include any such rule. Consequently, custom services that are executed from non-system paths fail, and SELinux logs Access Vector Cache (AVC) denial audit messages when it denies access. To work around this problem, do one of the following:

  • Execute the command using a shell with the -c option. For example:

    bash -c command
  • Execute the command from one of the common directories: /bin, /sbin, /usr/sbin, /usr/local/bin, or /usr/local/sbin.

(BZ#1860443)

Certain sets of interdependent rules in SSG can fail

Remediation of SCAP Security Guide (SSG) rules in a benchmark can fail due to undefined ordering of rules and their dependencies. If two or more rules need to be executed in a particular order, for example, when one rule installs a component and another rule configures the same component, they can run in the wrong order and remediation reports an error. To work around this problem, run the remediation twice, and the second run fixes the dependent rules.

(BZ#1750755)

Installation with the Server with GUI or Workstation software selections and CIS security profile is not possible

The CIS security profile is not compatible with the Server with GUI and Workstation software selections. As a consequence, a RHEL 8 installation with the Server with GUI software selection and CIS profile is not possible. An attempted installation using the CIS profile and either of these software selections will generate the error message:

package xorg-x11-server-common has been added to the list of excluded packages, but it can't be removed from the current software selection without breaking the installation.

To work around the problem, do not use the CIS security profile with the Server with GUI or Workstation software selections.

(BZ#1843932)

Kickstart uses org_fedora_oscap instead of com_redhat_oscap in RHEL 8

Kickstart references the Open Security Content Automation Protocol (OSCAP) Anaconda add-on as org_fedora_oscap instead of com_redhat_oscap, which might cause confusion. This is done to preserve backward compatibility with Red Hat Enterprise Linux 7.
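
For illustration, a Kickstart snippet that uses the add-on might look like the following sketch; the content type and profile ID are placeholders for your setup:

%addon org_fedora_oscap
    content-type = scap-security-guide
    profile = xccdf_org.ssgproject.content_profile_cis
%end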

(BZ#1665082)

usbguard-notifier logs too many error messages to the Journal

The usbguard-notifier service does not have inter-process communication (IPC) permissions for connecting to the usbguard-daemon IPC interface. Consequently, usbguard-notifier fails to connect to the interface and writes a corresponding error message to the Journal. Because usbguard-notifier starts with the --wait option, which makes it attempt to connect to the IPC interface every second after a connection failure, the log soon contains an excessive number of these messages by default.

To work around the problem, allow a user or a group under which usbguard-notifier is running to connect to the IPC interface. For example, the following error message contains the UID and GID values for the GNOME Display Manager (GDM):

IPC connection denied: uid=42 gid=42 pid=8382

To grant the missing permissions to the gdm user, use the usbguard command and restart the usbguard daemon:

# usbguard add-user gdm --group --devices listen
# systemctl restart usbguard

After granting the missing permissions, the error messages no longer appear in the log.

(BZ#2000000)

Certain rsyslog priority strings do not work correctly

Support for the GnuTLS priority string for imtcp that allows fine-grained control over encryption is not complete. Consequently, the following priority strings do not work properly in rsyslog:

NONE:+VERS-ALL:-VERS-TLS1.3:+MAC-ALL:+DHE-RSA:+AES-256-GCM:+SIGN-RSA-SHA384:+COMP-ALL:+GROUP-ALL

To work around this problem, use only correctly working priority strings:

NONE:+VERS-ALL:-VERS-TLS1.3:+MAC-ALL:+ECDHE-RSA:+AES-128-CBC:+SIGN-RSA-SHA1:+COMP-ALL:+GROUP-ALL
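
For illustration, such a priority string is typically set on the imtcp input in /etc/rsyslog.conf; the port and TLS stream driver settings below are assumptions for a typical TLS-enabled setup:

module(load="imtcp" StreamDriver.Name="gtls" StreamDriver.Mode="1" StreamDriver.Authmode="anon")
input(type="imtcp" port="6514" gnutlsPriorityString="NONE:+VERS-ALL:-VERS-TLS1.3:+MAC-ALL:+ECDHE-RSA:+AES-128-CBC:+SIGN-RSA-SHA1:+COMP-ALL:+GROUP-ALL")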

As a result, current configurations must be limited to the strings that work correctly.

(BZ#1679512)

10.7. Networking

The nm-cloud-setup service removes manually-configured secondary IP addresses from interfaces

Based on the information received from the cloud environment, the nm-cloud-setup service configures network interfaces. To configure interfaces manually, disable nm-cloud-setup. However, in certain cases, other services on the host can configure interfaces as well, for example by adding secondary IP addresses. To prevent nm-cloud-setup from removing secondary IP addresses:

  1. Stop and disable the nm-cloud-setup service and timer:

    # systemctl disable --now nm-cloud-setup.service nm-cloud-setup.timer
  2. Display the available connection profiles:

    # nmcli connection show
  3. Reactivate the affected connection profiles:

    # nmcli connection up "<profile_name>"

As a result, the service no longer removes manually-configured secondary IP addresses from interfaces.

(BZ#2132754)

NetworkManager does not support activating bond and team ports in a specific order

NetworkManager activates interfaces alphabetically by interface names. However, if an interface appears later during the boot, for example, because the kernel needs more time to discover it, NetworkManager activates this interface later. NetworkManager does not support setting a priority on bond and team ports. Consequently, the order in which NetworkManager activates ports of these devices is not always predictable. To work around this problem, write a dispatcher script.

For an example of such a script, see the corresponding comment in the ticket.
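
The following is a minimal sketch of such a dispatcher script; the device names bond0, ens1f0, and ens1f1 and the file name are placeholders for your environment:

#!/bin/bash
# /etc/NetworkManager/dispatcher.d/90-bond-ports (must be executable and owned by root)
# Re-activate the bond ports in a fixed order once the bond device is up.
# NetworkManager passes the interface name as $1 and the action as $2.
if [ "$1" = "bond0" ] && [ "$2" = "up" ]; then
    for port in ens1f0 ens1f1; do
        nmcli device connect "$port" || true
    done
fi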

(BZ#1920398)

Systems with the IPv6_rpfilter option enabled experience low network throughput

Systems with the IPv6_rpfilter option enabled in the firewalld.conf file currently experience suboptimal performance and low network throughput in high-traffic scenarios, such as 100-Gbps links. To work around the problem, disable the IPv6_rpfilter option. To do so, add the following line to the /etc/firewalld/firewalld.conf file:

IPv6_rpfilter=no

As a result, the system performs better, but also has reduced security.

(BZ#1871860)

10.8. Kernel

Reloading an identical crash extension may cause segmentation faults

When you load a copy of an already loaded crash extension file, it might trigger a segmentation fault. Currently, the crash utility cannot detect whether the original file has already been loaded. Consequently, due to two identical files co-existing in the crash utility, a namespace collision occurs, which causes the crash utility to trigger a segmentation fault.

You can work around the problem by loading the crash extension file only once. As a result, segmentation faults no longer occur in the described scenario.

(BZ#1906482)

vmcore capture fails after memory hot-plug or unplug operation

After a memory hot-plug or hot-unplug operation, the event arrives only after the device tree, which contains the memory layout information, has been updated. As a consequence, the makedumpfile utility tries to access a non-existent physical address. The problem appears if all of the following conditions are met:

  • A little-endian variant of IBM Power System runs RHEL 8.
  • The kdump or fadump service is enabled on the system.

Consequently, the capture kernel fails to save vmcore if a kernel crash is triggered after the memory hot-plug or hot-unplug operation.

To work around this problem, restart the kdump service after hot-plug or hot-unplug:

# systemctl restart kdump.service

As a result, vmcore is successfully saved in the described scenario.

(BZ#1793389)

Debug kernel fails to boot in crash capture environment on RHEL 8

Due to the memory-intensive nature of the debug kernel, a problem occurs when the debug kernel is in use and a kernel panic is triggered. As a consequence, the debug kernel is not able to boot as the capture kernel, and a stack trace is generated instead. To work around this problem, increase the crash kernel memory as required. As a result, the debug kernel boots successfully in the crash capture environment.

(BZ#1659609)

Allocating crash kernel memory fails at boot time

On certain Ampere Altra systems, allocating the crash kernel memory during boot fails when the 32-bit region is disabled in BIOS settings. Consequently, the kdump service fails to start. This is caused by memory fragmentation in the region below 4 GB with no fragment being large enough to contain the crash kernel memory.

To work around this problem, enable the 32-bit memory region in BIOS as follows:

  1. Open the BIOS settings on your system.
  2. Open the Chipset menu.
  3. Under Memory Configuration, enable the Slave 32-bit option.

As a result, crash kernel memory allocation within the 32-bit region succeeds and the kdump service works as expected.

(BZ#1940674)

kdump fails on some KVM virtual machines using default crash kernel memory

On some KVM virtual machines kdump fails when using the default amount of memory for kdump to capture the kernel crash dump. Consequently, the crash kernel displays the following error:

/bin/sh: error while loading shared libraries: libtinfo.so.6: cannot open shared object file: No such file or directory

To work around this problem, increase the value of the crashkernel= option by a minimum of 32M to fit the size requirement for kdump. That is, the final value must be the sum of the current value and 32M.

In the case of the crashkernel=auto parameter:

  1. Check the current memory size, and increase the size by 32M as follows:

    echo $(($(cat /sys/kernel/kexec_crash_size)/1048576+32))M
  2. Configure the kernel crashkernel parameter to crashkernel=x, where x is the increased size.
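
For example, if the command above prints 256M, you can set the new size with grubby and reboot; the value shown is a placeholder for your computed size:

# grubby --update-kernel=ALL --args="crashkernel=256M"
# reboot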

(BZ#2004000)

The QAT manager leaves no spare device for LKCF

The Intel® QuickAssist Technology (QAT) manager (qatmgr) is a user space process, which by default uses all QAT devices in the system. As a consequence, there are no QAT devices left for the Linux Kernel Cryptographic Framework (LKCF). There is no need to work around this situation, as this behavior is expected and a majority of users will use acceleration from the user space.

(BZ#1920086)

The kernel ACPI driver reports it has no access to a PCIe ECAM memory region

The Advanced Configuration and Power Interface (ACPI) table provided by firmware does not define a memory region on the PCI bus in the Current Resource Settings (_CRS) method for the PCI bus device. Consequently, the following warning message occurs during the system boot:

[    2.817152] acpi PNP0A08:00: [Firmware Bug]: ECAM area [mem 0x30000000-0x31ffffff] not reserved in ACPI namespace
[    2.827911] acpi PNP0A08:00: ECAM at [mem 0x30000000-0x31ffffff] for [bus 00-1f]

However, the kernel is still able to access the 0x30000000-0x31ffffff memory region, and assigns that memory region to the PCI Enhanced Configuration Access Mechanism (ECAM) properly. You can verify that PCI ECAM works correctly by reading the PCIe configuration space beyond the first 256 bytes, for example in lspci -vvv output such as the following:

03:00.0 Non-Volatile memory controller: Sandisk Corp WD Black 2018/PC SN720 NVMe SSD (prog-if 02 [NVM Express])
 ...
        Capabilities: [900 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
                          PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=0ns
                L1SubCtl2: T_PwrOn=10us

As a result, you can ignore the warning message.

For more information about the problem, see the "Firmware Bug: ECAM area mem 0x30000000-0x31ffffff not reserved in ACPI namespace" appears during system boot solution.

(BZ#1868526)

The tuned-adm profile powersave command causes the system to become unresponsive

Executing the tuned-adm profile powersave command leads to an unresponsive state of the Penguin Valkyrie 2000 2-socket systems with the older ThunderX (CN88xx) processors. Consequently, you must reboot the system to resume operation. To work around this problem, avoid using the powersave profile if your system matches the mentioned specifications.

(BZ#1609288)

Using irqpoll causes vmcore generation failure

An existing problem with the nvme driver on the 64-bit ARM architecture running on the Amazon Web Services (AWS) cloud platform causes vmcore generation to fail when you provide the irqpoll kernel command line parameter to the first kernel. Consequently, no vmcore file is dumped in the /var/crash/ directory upon a kernel crash. To work around this problem:

  1. Append irqpoll to KDUMP_COMMANDLINE_REMOVE in the /etc/sysconfig/kdump file:

    KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swiotlb irqpoll"
  2. Remove irqpoll from KDUMP_COMMANDLINE_APPEND in the /etc/sysconfig/kdump file:

    KDUMP_COMMANDLINE_APPEND="nr_cpus=1 reset_devices cgroup_disable=memory udev.children-max=2 panic=10 swiotlb=noforce novmcoredd"
  3. Restart the kdump service:

    # systemctl restart kdump

As a result, the first kernel boots correctly and the vmcore file is expected to be captured upon the kernel crash.

Note that the kdump service can use a significant amount of crash kernel memory to dump the vmcore file. Ensure that the capture kernel has sufficient memory available for the kdump service.

For related information on this Known Issue, see the The irqpoll kernel command line parameter might cause vmcore generation failure article.

(BZ#1654962)

The HP NMI watchdog does not always generate a crash dump

In certain cases, the hpwdt driver for the HP NMI watchdog is not able to claim a non-maskable interrupt (NMI) generated by the HPE watchdog timer because the NMI was instead consumed by the perfmon driver.

The missing NMI is initiated by one of two conditions:

  1. The Generate NMI button on the Integrated Lights-Out (iLO) server management software. This button is triggered by a user.
  2. The hpwdt watchdog. The expiration by default sends an NMI to the server.

Both sequences typically occur when the system is unresponsive. Under normal circumstances, the NMI handler for both these situations calls the kernel panic() function and if configured, the kdump service generates a vmcore file.

Because of the missing NMI, however, kernel panic() is not called and vmcore is not collected.

In the first case (1.), if the system was unresponsive, it remains so. To work around this scenario, use the virtual Power button to reset or power cycle the server.

In the second case (2.), the missing NMI is followed 9 seconds later by a reset from the Automated System Recovery (ASR).

The HPE Gen9 server line experiences this problem in single-digit percentages of cases; the Gen10 line, at an even lower frequency.

(BZ#1602962)

Connections fail when attaching a virtual function to virtual machine

Pensando network cards that use the ionic device driver silently accept VLAN tag configuration requests and attempt configuring network connections while attaching network virtual functions (VF) to a virtual machine (VM). Such network connections fail as this feature is not yet supported by the card’s firmware.

(BZ#1930576)

The OPEN MPI library may trigger run-time failures with default PML

In the OPEN Message Passing Interface (OPEN MPI) implementation 4.0.x series, Unified Communication X (UCX) is the default point-to-point messaging layer (PML). Later versions of the OPEN MPI 4.0.x series deprecated the openib Byte Transfer Layer (BTL).

However, when OPEN MPI runs over a homogeneous cluster (same hardware and software configuration), UCX still uses the openib BTL for MPI one-sided operations. As a consequence, this may trigger execution errors. To work around this problem:

  • Run the mpirun command using the following parameters:
-mca btl ^openib -mca pml ucx -x UCX_NET_DEVICES=mlx5_ib0

where,

  • The -mca btl ^openib parameter disables the openib BTL.
  • The -mca pml ucx parameter configures OPEN MPI to use the ucx PML.
  • The -x UCX_NET_DEVICES= parameter restricts UCX to use the specified devices.
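
A complete invocation might therefore look like the following sketch; the process count, network device, and application name (./my_mpi_app) are placeholders:

$ mpirun -np 4 -mca btl ^openib -mca pml ucx -x UCX_NET_DEVICES=mlx5_ib0 ./my_mpi_app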

When OPEN MPI runs over a heterogeneous cluster (different hardware and software configurations), it uses UCX as the default PML. As a consequence, this may cause the OPEN MPI jobs to run with erratic performance, unresponsive behavior, or crash failures. To work around this problem, set the UCX priority:

  • Run the mpirun command using the following parameters:
-mca pml_ucx_priority 5

As a result, the OPEN MPI library is able to choose an alternative available transport layer over UCX.

(BZ#1866402)

Solarflare NICs fail to create the maximum number of virtual functions (VFs)

The Solarflare NICs fail to create the maximum number of VFs due to insufficient resources. You can check the maximum number of VFs that a PCIe device can create in the /sys/bus/pci/devices/PCI_ID/sriov_totalvfs file. To work around this problem, lower either the number of VFs or the VF MSI interrupt value, either from the Solarflare Boot Manager on startup or by using the Solarflare sfboot utility. The default VF MSI interrupt value is 8.

  • To adjust the VF MSI interrupt value using sfboot:

# sfboot vf-msix-limit=2

Note: Adjusting the VF MSI interrupt value affects VF performance.

For more information about parameters to be adjusted accordingly, see the Solarflare Server Adapter user guide.

(BZ#1971506)

10.9. Hardware enablement

The default 7 4 1 7 printk value sometimes causes temporary system unresponsiveness

The default 7 4 1 7 printk value allows for better debugging of the kernel activity. However, when coupled with a serial console, this printk setting can cause intense I/O bursts that can lead to a RHEL system becoming temporarily unresponsive. To work around this problem, we have added a new optimize-serial-console TuneD profile, which reduces the default printk value to 4 4 1 7. Users can instrument their system as follows:

# tuned-adm profile throughput-performance optimize-serial-console

Having a lower printk value persistent across a reboot reduces the likelihood of system hangs.

Note that this setting change comes at the expense of losing the extra debugging information.

(JIRA:RHELPLAN-28940)

10.10. File systems and storage

Limitations of LVM writecache

The writecache LVM caching method has the following limitations, which are not present in the cache method:

  • You cannot name a writecache logical volume when using pvmove commands.
  • You cannot use logical volumes with writecache in combination with thin pools or VDO.

The following limitation also applies to the cache method:

  • You cannot resize a logical volume while cache or writecache is attached to it.

(JIRA:RHELPLAN-27987, BZ#1798631, BZ#1808012)

LVM mirror devices that store a LUKS volume sometimes become unresponsive

Mirrored LVM devices with a segment type of mirror that store a LUKS volume might become unresponsive under certain conditions. The unresponsive devices reject all I/O operations.

To work around the issue, Red Hat recommends that you use LVM RAID 1 devices with a segment type of raid1 instead of mirror if you need to stack LUKS volumes on top of resilient software-defined storage.

The raid1 segment type is the default RAID configuration type and replaces mirror as the recommended solution.

To convert mirror devices to raid, see Converting a mirrored LVM device to a RAID1 logical volume.
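
For example, assuming a volume group my_vg with a mirrored logical volume my_lv (both names are placeholders), the conversion is a single command:

# lvconvert --type raid1 my_vg/my_lv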

(BZ#1730502)

The /boot file system cannot be placed on LVM

You cannot place the /boot file system on an LVM logical volume. This limitation exists for the following reasons:

  • On EFI systems, the EFI System Partition conventionally serves as the /boot file system. The uEFI standard requires a specific GPT partition type and a specific file system type for this partition.
  • RHEL 8 uses the Boot Loader Specification (BLS) for system boot entries. This specification requires that the /boot file system is readable by the platform firmware. On EFI systems, the platform firmware can read only the /boot configuration defined by the uEFI standard.
  • The support for LVM logical volumes in the GRUB 2 boot loader is incomplete. Red Hat does not plan to improve the support because the number of use cases for the feature is decreasing due to standards such as uEFI and BLS.

Red Hat does not plan to support /boot on LVM. Instead, Red Hat provides tools for managing system snapshots and rollback that do not need the /boot file system to be placed on an LVM logical volume.

(BZ#1496229)

LVM no longer allows creating volume groups with mixed block sizes

LVM utilities such as vgcreate or vgextend no longer allow you to create volume groups (VGs) where the physical volumes (PVs) have different logical block sizes. LVM has adopted this change because file systems fail to mount if you extend the underlying logical volume (LV) with a PV of a different block size.

To re-enable creating VGs with mixed block sizes, set the allow_mixed_block_sizes=1 option in the lvm.conf file.
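
For reference, the option belongs in the devices section of /etc/lvm/lvm.conf:

devices {
    allow_mixed_block_sizes = 1
}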

(BZ#1768536)

GRUB retries accessing the disk after initial failures during boot

Sometimes, Storage Area Networks (SANs) fail to acknowledge the open and read disk calls. Previously, GRUB entered the grub_rescue prompt in this situation, resulting in a boot failure. With this update, GRUB retries accessing the disk up to 20 times after the initial call to open and read the disk fails. If GRUB is still unable to open or read the disk after these attempts, it enters grub_rescue mode.

(BZ#1987087)

XFS quota warnings are triggered too often

Using the quota timer results in quota warnings triggering too often, which causes soft quotas to be enforced sooner than they should be. To work around this problem, do not use soft quotas, which prevents the warnings from triggering. As a result, warning messages no longer enforce the soft quota limit prematurely, and the configured timeout is respected.

(BZ#2059262)

10.11. Dynamic programming languages, web and database servers

MariaDB 10.5 does not warn about dropping a non-existent table when the OQGraph plug-in is enabled

When the OQGraph storage engine plug-in is loaded to the MariaDB 10.5 server, MariaDB does not warn about dropping a non-existent table. In particular, when the user attempts to drop a non-existent table using the DROP TABLE or DROP TABLE IF EXISTS SQL commands, MariaDB neither returns an error message nor logs a warning.

Note that the OQGraph plug-in is provided by the mariadb-oqgraph-engine package, which is not installed by default.

(BZ#1944653)

PAM plug-in version 1.0 does not work in MariaDB

MariaDB 10.3 provides the Pluggable Authentication Modules (PAM) plug-in version 1.0. MariaDB 10.5 provides the plug-in versions 1.0 and 2.0; version 2.0 is the default.

The MariaDB PAM plug-in version 1.0 does not work in RHEL 8. To work around this problem, use the PAM plug-in version 2.0 provided by the mariadb:10.5 module stream.

(BZ#1942330)

getpwnam() might fail when called by a 32-bit application

When a NIS user runs a 32-bit application that calls the getpwnam() function, the call fails if the nss_nis.i686 package is missing. To work around this problem, manually install the missing package by using the yum install nss_nis.i686 command.

(BZ#1803161)

Symbol conflicts between OpenLDAP libraries might cause crashes in httpd

When both the libldap and libldap_r libraries provided by OpenLDAP are loaded and used within a single process, symbol conflicts between these libraries might occur. Consequently, Apache httpd child processes using the PHP ldap extension might terminate unexpectedly if the mod_security or mod_auth_openidc modules are also loaded by the httpd configuration.

Since the RHEL 8.3 update to the Apache Portable Runtime (APR) library, you can work around the problem by setting the APR_DEEPBIND environment variable, which enables the use of the RTLD_DEEPBIND dynamic linker option when loading httpd modules. When the APR_DEEPBIND environment variable is enabled, crashes no longer occur in httpd configurations that load conflicting libraries.
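
For example, you can set the variable through a systemd drop-in for the httpd service; the value 1 is an assumption, and any non-empty value should suffice:

# systemctl edit httpd

[Service]
Environment=APR_DEEPBIND=1

# systemctl restart httpd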

(BZ#1819607)

10.12. Compilers and development tools

Using CryptBlocks multiple times over the same input stream leads to incorrect encryption

When Go FIPS mode is enabled, AES CBC CryptBlocks incorrectly re-initializes the initialization vector (IV). As a result, using CryptBlocks multiple times over the same input stream encrypts files incorrectly. To work around this issue, do not reinitialize the IV in the aes-cbc interface. This allows files to be encrypted correctly.

(BZ#1972825)

10.13. Identity Management

Windows Server 2008 R2 and earlier no longer supported

In RHEL 8.4 and later, Identity Management (IdM) does not support establishing trust to Active Directory with Active Directory domain controllers running Windows Server 2008 R2 or earlier versions. RHEL IdM now requires SMB encryption when establishing the trust relationship, which is only available with Windows Server 2012 or later.

(BZ#1971061)

Using the cert-fix utility with the --agent-uid pkidbuser option breaks Certificate System

Using the cert-fix utility with the --agent-uid pkidbuser option corrupts the LDAP configuration of Certificate System. As a consequence, Certificate System might become unstable and manual steps are required to recover the system.

(BZ#1729215)

FreeRADIUS silently truncates Tunnel-Passwords longer than 249 characters

If a Tunnel-Password is longer than 249 characters, the FreeRADIUS service silently truncates it. This may lead to unexpected password incompatibilities with other systems.

To work around the problem, choose a password that is 249 characters or fewer.

(BZ#1723362)

The /var/log/lastlog sparse file on IdM hosts can cause performance problems

During the IdM installation, a range of 200,000 UIDs from a total of 10,000 possible ranges is randomly selected and assigned. Selecting a random range in this way significantly reduces the probability of conflicting IDs in case you decide to merge two separate IdM domains in the future.

However, having high UIDs can create problems with the /var/log/lastlog file. For example, if a user with the UID of 1280000008 logs in to an IdM client, the local /var/log/lastlog file size increases to almost 400 GB. Although the actual file is sparse and does not use all that space, certain applications are not designed to identify sparse files by default and may require a specific option to handle them. For example, if the setup is complex and a backup and copy application does not handle sparse files correctly, the file is copied as if its size was 400 GB. This behavior can cause performance problems.

To work around this problem:

  • In case of a standard package, refer to its documentation to identify the option that handles sparse files.
  • In case of a custom application, ensure that it is able to manage sparse files such as /var/log/lastlog correctly.

(JIRA:RHELPLAN-59111)

FIPS mode does not support using a shared secret to establish a cross-forest trust

Establishing a cross-forest trust using a shared secret fails in FIPS mode because NTLMSSP authentication is not FIPS-compliant. To work around this problem, authenticate with an Active Directory (AD) administrative account when establishing a trust between an IdM domain with FIPS mode enabled and an AD domain.

(BZ#1924707)

FreeRADIUS server fails to run in FIPS mode

By default, in FIPS mode, OpenSSL disables the use of the MD5 digest algorithm. As the RADIUS protocol requires MD5 to encrypt a secret between the RADIUS client and the RADIUS server, this causes the FreeRADIUS server to fail in FIPS mode.

To work around this problem, follow these steps:

Procedure

  1. Create the RADIUS_MD5_FIPS_OVERRIDE environment variable for the radiusd service:

    # systemctl edit radiusd

    [Service]
    Environment=RADIUS_MD5_FIPS_OVERRIDE=1
  2. To apply the change, reload the systemd configuration and start the radiusd service:

    # systemctl daemon-reload
    # systemctl start radiusd
  3. To run FreeRADIUS in debug mode:

    # RADIUS_MD5_FIPS_OVERRIDE=1 radiusd -X

Note that although FreeRADIUS can run in FIPS mode, this does not mean that it is FIPS-compliant, as it uses weak ciphers and functions when in FIPS mode.

For more information on configuring FreeRADIUS authentication in FIPS mode, see How to configure FreeRADIUS authentication in FIPS mode.

(BZ#1958979)

Actions required when running Samba as a print server

With this update, the samba package no longer creates the /var/spool/samba/ directory. If you use Samba as a print server and use /var/spool/samba/ in the [printers] share to spool print jobs, SELinux prevents Samba users from creating files in this directory. Consequently, print jobs fail and the auditd service logs a denied message in /var/log/audit/audit.log. To avoid this problem after updating your system to RHEL 8.5:

  1. Search for the [printers] share in the /etc/samba/smb.conf file.
  2. If the share definition contains path = /var/spool/samba/, update the setting and set the path parameter to /var/tmp/.
  3. Restart the smbd service:

    # systemctl restart smbd
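
For reference, a minimal [printers] share after the change might look like the following sketch; other parameters in your configuration stay as they are:

[printers]
        comment = All Printers
        path = /var/tmp
        printable = yes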

If you newly installed Samba on RHEL 8.5, no action is required. The default /etc/samba/smb.conf file provided by the samba-common package on RHEL 8.5 already uses the /var/tmp/ directory to spool print jobs.

(BZ#2009213)

The default keyword for enabled ciphers in the NSS does not work in conjunction with other ciphers

In Directory Server, you can use the default keyword to refer to the default ciphers enabled in Network Security Services (NSS). However, if you want to enable the default ciphers and additional ones using the command line or web console, Directory Server fails to resolve the default keyword. As a consequence, the server enables only the additionally specified ciphers and logs the following error:

Security Initialization - SSL alert: Failed to set SSL cipher preference information: invalid ciphers <default,+__cipher_name__>: format is +cipher1,-cipher2... (Netscape Portable Runtime error 0 - no error)

As a workaround, specify all ciphers that are enabled by default in NSS including the ones you want to additionally enable.

(BZ#1817505)

Potential risk when using the default value for ldap_id_use_start_tls option

Using ldap:// without TLS for identity lookups can pose a risk for an attack vector, particularly a man-in-the-middle (MITM) attack, which could allow an attacker to impersonate a user by altering, for example, the UID or GID of an object returned in an LDAP search.

Currently, the SSSD configuration option to enforce TLS, ldap_id_use_start_tls, defaults to false. Ensure that your setup operates in a trusted environment and decide whether it is safe to use unencrypted communication for id_provider = ldap. Note that id_provider = ad and id_provider = ipa are not affected, as they use encrypted connections protected by SASL and GSSAPI.

If it is not safe to use unencrypted communication, enforce TLS by setting the ldap_id_use_start_tls option to true in the /etc/sssd/sssd.conf file. The default behavior is planned to be changed in a future release of RHEL.
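
A minimal sketch of the relevant /etc/sssd/sssd.conf domain section, with placeholder domain and server names:

[domain/example.com]
id_provider = ldap
ldap_uri = ldap://ldap.example.com
ldap_id_use_start_tls = true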

(JIRA:RHELPLAN-155168)

10.14. Desktop

Disabling flatpak repositories from Software Repositories is not possible

Currently, it is not possible to disable or remove flatpak repositories in the Software Repositories tool in the GNOME Software utility.

(BZ#1668760)

Drag-and-drop does not work between desktop and applications

Due to a bug in the gnome-shell-extensions package, the drag-and-drop functionality does not currently work between desktop and applications. Support for this feature will be added back in a future release.

(BZ#1717947)

Generation 2 RHEL 8 virtual machines sometimes fail to boot on Hyper-V Server 2016 hosts

When using RHEL 8 as the guest operating system on a virtual machine (VM) running on a Microsoft Hyper-V Server 2016 host, the VM in some cases fails to boot and returns to the GRUB boot menu. In addition, the following error is logged in the Hyper-V event log:

The guest operating system reported that it failed with the following error code: 0x1E

This error occurs due to a UEFI firmware bug on the Hyper-V host. To work around this problem, use Hyper-V Server 2019 as the host.

(BZ#1583445)

Current limitations of Flatpak

You can install certain applications using the Flatpak package manager. However, Flatpak is currently missing certain functions or features. Notably:

  • Flatpak does not yet have feature parity for CVE and changelog information. The GNOME Software application currently provides no information about the respective package or any CVEs for Flatpak applications.
  • GPG key checking is disabled by default when adding Red Hat Flatpak remote repositories.
  • Flatpak applications do not have unique icons. In GNOME Software, an application shows both the RPM and Flatpak versions. As a workaround, you can find the application origin by clicking Show Details on the respective icon.
  • Flatpak applications are unable to process Kerberos tickets.
  • Printing from Flatpak applications is currently unavailable.
  • The Red Hat Flatpak remote is not automatically added. To enable it, follow the instructions in the product documentation: Enabling the Red Hat Flatpak remote

(JIRA:RHELPLAN-100230)

10.15. Graphics infrastructures

Multiple HDR displays on a single MST topology may not power on

On systems using NVIDIA Turing GPUs with the nouveau driver, using a DisplayPort hub (such as a laptop dock) with multiple HDR-capable monitors plugged into it may result in the displays failing to turn on. This is due to the system erroneously determining that there is not enough bandwidth on the hub to support all of the displays.

(BZ#1812577)

radeon fails to reset hardware correctly

The radeon kernel driver currently does not reset hardware correctly in the kexec context. Instead, radeon fails, which causes the rest of the kdump service to fail.

To work around this problem, disable radeon in kdump by adding the following line to the /etc/kdump.conf file:

dracut_args --omit-drivers "radeon"
force_rebuild 1

Restart the machine and the kdump service. After starting kdump, you can remove the force_rebuild 1 line from the configuration file.

Note that in this scenario, no graphics will be available during kdump, but kdump will work successfully.

(BZ#1694705)

GUI in ESXi might crash due to low video memory

The graphical user interface (GUI) on RHEL virtual machines (VMs) in the VMware ESXi 7.0.1 hypervisor with vCenter Server 7.0.1 requires a certain amount of video memory. If you connect multiple consoles or high-resolution monitors to the VM, the GUI requires at least 16 MB of video memory. If you start the GUI with less video memory, the GUI might terminate unexpectedly.

To work around the problem, configure the hypervisor to assign at least 16 MB of video memory to the VM. As a result, the GUI on the VM no longer crashes.

(BZ#1910358)

VNC Viewer displays wrong colors with the 16-bit color depth on IBM Z

The VNC Viewer application displays wrong colors when you connect to a VNC session on an IBM Z server with the 16-bit color depth.

To work around the problem, set the 24-bit color depth on the VNC server. With the Xvnc server, replace the -depth 16 option with -depth 24 in the Xvnc configuration.

As a result, VNC clients display the correct colors, but the session uses more network bandwidth.

(BZ#1886147)

Matrox GPU with a VGA display shows no output

Your display might show no graphical output if you use the following system configuration:

  • A GPU in the Matrox MGA G200 family
  • A display connected over the VGA controller
  • UEFI switched to legacy mode

As a consequence, you cannot use or install RHEL on this configuration.

To work around the problem, use the following procedure:

  1. Boot the system to the boot loader menu.
  2. Add the nomodeset option to the kernel command line.

As a result, RHEL boots and shows graphical output as expected, but the maximum resolution is limited.

(BZ#1953926)

Unable to run graphical applications using sudo command

When trying to run graphical applications as a user with elevated privileges, the application fails to open with an error message. The failure happens because Xwayland is restricted by the Xauthority file to use regular user credentials for authentication.

To work around this problem, use the sudo -E command to run graphical applications as a root user.

(BZ#1673073)

Hardware acceleration is not supported on ARM

Built-in graphics drivers do not support hardware acceleration or the Vulkan API on the 64-bit ARM architecture.

To enable hardware acceleration or Vulkan on ARM, install the proprietary Nvidia driver.

(JIRA:RHELPLAN-57914)

10.16. Virtualization

Hot unplugging an IBMVFC device on PowerVM fails

When using a virtual machine (VM) with a RHEL 8 guest operating system on the PowerVM hypervisor, attempting to remove an IBM Power Virtual Fibre Channel (IBMVFC) device from the running VM currently fails. Instead, it displays an outstanding translation error.

To work around this problem, remove the IBMVFC device when the VM is shut down.

(BZ#1959020)

IBM POWER hosts may crash when using the ibmvfc driver

When running RHEL 8 on a PowerVM logical partition (LPAR), a variety of errors may currently occur due to problems with the ibmvfc driver. As a consequence, the host’s kernel may panic under certain circumstances, such as:

  • Using the Live Partition Mobility (LPM) feature
  • Resetting a host adapter
  • Using SCSI error handling (SCSI EH) functions

(BZ#1961722)

Using perf kvm record on IBM POWER Systems can cause the VM to crash

When using a RHEL 8 host on the little-endian variant of IBM POWER hardware, using the perf kvm record command to collect trace event samples for a KVM virtual machine (VM) in some cases results in the VM becoming unresponsive. This situation occurs when:

  • The perf utility is used by an unprivileged user, and the -p option is used to identify the VM - for example perf kvm record -e trace_cycles -p 12345.
  • The VM was started using the virsh shell.

To work around this problem, use the perf kvm utility with the -i option to monitor VMs that were created using the virsh shell. For example:

# perf kvm record -e trace_imc/trace_cycles/  -p <guest pid> -i

Note that when using the -i option, child tasks do not inherit counters, and threads will therefore not be monitored.

(BZ#1924016)

Attaching LUN devices to virtual machines using virtio-blk does not work

The q35 machine type does not support transitional virtio 1.0 devices, and RHEL 8 therefore lacks support for features that were deprecated in virtio 1.0. In particular, it is not possible on a RHEL 8 host to send SCSI commands from virtio-blk devices. As a consequence, attaching a physical disk as a LUN device to a virtual machine fails when using the virtio-blk controller.

Note that physical disks can still be passed through to the guest operating system, but they should be configured with the device='disk' option rather than device='lun'.
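
For reference, a minimal sketch of such a configuration in the libvirt domain XML, assuming the physical disk is available on the host as /dev/sdb:

<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb'/>
  <target dev='vdb' bus='virtio'/>
</disk>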

(BZ#1777138)

Virtual machines with iommu_platform=on fail to start on IBM POWER

RHEL 8 currently does not support the iommu_platform=on parameter for virtual machines (VMs) on IBM POWER systems. As a consequence, starting a VM with this parameter on IBM POWER hardware results in the VM becoming unresponsive during the boot process.

(BZ#1910848)

Windows Server 2016 virtual machines with Hyper-V enabled fail to boot when using certain CPU models

Currently, it is not possible to boot a virtual machine (VM) that uses Windows Server 2016 as the guest operating system, has the Hyper-V role enabled, and uses one of the following CPU models:

  • EPYC-IBPB
  • EPYC

To work around this problem, use the EPYC-v3 CPU model, or manually enable the xsaves CPU flag for the VM.
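
With libvirt, for example, the xsaves flag can be enabled in the VM's CPU definition, sketched as follows:

<cpu mode='custom' match='exact'>
  <model>EPYC</model>
  <feature policy='require' name='xsaves'/>
</cpu>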

(BZ#1942888)

Migrating a POWER9 guest from a RHEL 7-ALT host to RHEL 8 fails

Currently, when migrating a POWER9 virtual machine from a RHEL 7-ALT host system to RHEL 8, the migration becomes unresponsive and remains in the Migration status: active state.

To work around this problem, disable Transparent Huge Pages (THP) on the RHEL 7-ALT host, which enables the migration to complete successfully.
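
For example, to disable THP on the host until the next reboot:

# echo never > /sys/kernel/mm/transparent_hugepage/enabled

To make the change persistent, add the transparent_hugepage=never option to the host's kernel command line.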

(BZ#1741436)

Using virt-customize sometimes causes guestfs-firstboot to fail

After modifying a virtual machine (VM) disk image using the virt-customize utility, the guestfs-firstboot service in some cases fails due to incorrect SELinux permissions. This causes a variety of problems during VM startup, such as failing user creation or system registration.

To avoid this problem, add the --selinux-relabel option to the virt-customize command.
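
For example, when installing a package into a hypothetical rhel8.qcow2 image:

# virt-customize -a rhel8.qcow2 --install cloud-init --selinux-relabel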

(BZ#1554735)

Deleting a forward interface from a macvtap virtual network resets all connection counts of this network

Currently, deleting a forward interface from a macvtap virtual network with multiple forward interfaces also resets the connection status of the other forward interfaces of the network. As a consequence, the connection information in the live network XML is incorrect. Note, however, that this does not affect the functionality of the virtual network. To work around the issue, restart the libvirtd service on your host.
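
For example:

# systemctl restart libvirtd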

(BZ#1332758)

Virtual machines with SLOF fail to boot in netcat interfaces

When using a netcat (nc) interface to access the console of a virtual machine (VM) that is currently waiting at the Slimline Open Firmware (SLOF) prompt, the user input is ignored and the VM stays unresponsive. To work around this problem, use the nc -C option when connecting to the VM, or use a telnet interface instead.
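
For example, assuming the VM's console is exposed on TCP port 2445 of the local host (hypothetical values):

$ nc -C localhost 2445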

(BZ#1974622)

Mounting virtiofs directories fails in certain circumstances on RHEL 8 guests

Currently, when using the virtiofs feature to provide a host directory to a virtual machine (VM), mounting the directory on the VM fails with an "Operation not supported" error if the VM is using a RHEL 8.4 kernel but a RHEL 8.5 selinux-policy package.
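
For reference, a virtiofs mount on the guest has the following form, where mount_tag is the tag defined in the VM's virtiofs device configuration and /mnt/mount_dir is a hypothetical mount point:

# mount -t virtiofs mount_tag /mnt/mount_dir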

To work around this issue, reboot the guest into the latest kernel available on the guest.

(BZ#1995558)

Virtual machines sometimes fail to start when using many virtio-blk disks

Adding a large number of virtio-blk devices to a virtual machine (VM) may exhaust the number of interrupt vectors available in the platform. If this occurs, the VM’s guest OS fails to boot, and displays a dracut-initqueue[392]: Warning: Could not boot error.

(BZ#1719687)

SMT CPU topology is not detected by VMs when using host passthrough mode on AMD EPYC

When a virtual machine (VM) boots with the CPU host passthrough mode on an AMD EPYC host, the TOPOEXT CPU feature flag is not present. Consequently, the VM is not able to detect a virtual CPU topology with multiple threads per core. To work around this problem, boot the VM with the EPYC CPU model instead of host passthrough.

(BZ#1740002)

10.17. RHEL in cloud environments

Setting static IP in a RHEL 8 virtual machine on a VMware host does not work

Currently, when using RHEL 8 as a guest operating system of a virtual machine (VM) on a VMware host, the DatasourceOVF function does not work correctly. As a consequence, if you use the cloud-init utility to set the VM’s network to static IP and then reboot the VM, the VM’s network will be changed to DHCP.

(BZ#1750862)

kdump sometimes does not start on Azure and Hyper-V

On RHEL 8 guest operating systems hosted on the Microsoft Azure or Hyper-V hypervisors, starting the kdump kernel in some cases fails when post-exec notifiers are enabled.

To work around this problem, disable crash kexec post notifiers:

# echo N > /sys/module/kernel/parameters/crash_kexec_post_notifiers

(BZ#1865745)

The SCSI host address sometimes changes when booting a Hyper-V VM with multiple guest disks

Currently, when booting a RHEL 8 virtual machine (VM) on the Hyper-V hypervisor, the host portion of the Host, Bus, Target, Lun (HBTL) SCSI address in some cases changes. As a consequence, automated tasks set up with the HBTL SCSI identification or device node in the VM do not work consistently. This occurs if the VM has more than one disk or if the disks have different sizes.

To work around the problem, modify your kickstart files, using one of the following methods:

Method 1: Use persistent identifiers for SCSI devices.

You can, for example, use the following PowerShell script to determine the specific device identifiers:

# Output the /dev/disk/by-id/<value> for the specified Hyper-V virtual disk.
# Takes a single parameter which is the virtual disk file.
# Note: kickstart syntax works with and without the /dev/ prefix.
param (
    [Parameter(Mandatory=$true)][string]$virtualdisk
)

$what = Get-VHD -Path $virtualdisk
$part = $what.DiskIdentifier.ToLower().split('-')

$p = $part[0]
$s0 = $p[6] + $p[7] + $p[4] + $p[5] + $p[2] + $p[3] + $p[0] + $p[1]

$p = $part[1]
$s1 =  $p[2] + $p[3] + $p[0] + $p[1]

[string]::format("/dev/disk/by-id/wwn-0x60022480{0}{1}{2}", $s0, $s1, $part[4])

You can use this script on the Hyper-V host, for example, as follows:

PS C:\Users\Public\Documents\Hyper-V\Virtual hard disks> .\by-id.ps1 .\Testing_8\disk_3_8.vhdx
/dev/disk/by-id/wwn-0x60022480e00bc367d7fd902e8bf0d3b4
PS C:\Users\Public\Documents\Hyper-V\Virtual hard disks> .\by-id.ps1 .\Testing_8\disk_3_9.vhdx
/dev/disk/by-id/wwn-0x600224807270e09717645b1890f8a9a2

Afterwards, the disk values can be used in the kickstart file, for example as follows:

part / --fstype=xfs --grow --asprimary --size=8192 --ondisk=/dev/disk/by-id/wwn-0x600224807270e09717645b1890f8a9a2
part /home --fstype="xfs" --grow --ondisk=/dev/disk/by-id/wwn-0x60022480e00bc367d7fd902e8bf0d3b4

As these values are specific to each virtual disk, the configuration needs to be done for each VM instance. It may, therefore, be useful to use the %include syntax to place the disk information into a separate file.

Method 2: Set up device selection by size.

A kickstart file that configures disk selection based on size must include lines similar to the following:

...

# Disk partitioning information is supplied in a file to Kickstart
%include /tmp/disks

...

# Partition information is created during install using the %pre section
%pre --interpreter /bin/bash --log /tmp/ks_pre.log

	# Dump whole SCSI/IDE disks out sorted from smallest to largest, outputting
	# just the name
	disks=(`lsblk -n -o NAME -l -b -x SIZE -d -I 8,3`) || exit 1

	# We assume that 3 disks will be used,
	# and create variables to represent them
	d0=${disks[0]}
	d1=${disks[1]}
	d2=${disks[2]}

	echo "part /home --fstype="xfs" --ondisk=$d2 --grow" >> /tmp/disks
	echo "part swap --fstype="swap" --ondisk=$d0 --size=4096" >> /tmp/disks
	echo "part / --fstype="xfs" --ondisk=$d1 --grow" >> /tmp/disks
	echo "part /boot --fstype="xfs" --ondisk=$d1 --size=1024" >> /tmp/disks

%end

(BZ#1906870)

10.18. Supportability

redhat-support-tool does not work with the FUTURE crypto policy

Because a cryptographic key used by a certificate on the Customer Portal API does not meet the requirements of the FUTURE system-wide cryptographic policy, the redhat-support-tool utility currently does not work with this policy level.

To work around this problem, use the DEFAULT crypto policy while connecting to the Customer Portal API.
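
For example, to switch the system-wide policy to DEFAULT:

# update-crypto-policies --set DEFAULT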

(BZ#1802026)

10.19. Containers

Rootless containers with fuse-overlayfs do not recognize removed files

In RHEL 8.4 and earlier, rootless images and containers were created or stored using the fuse-overlayfs file system. In RHEL 8.5 and later, unprivileged users use the overlayfs implementation provided by the kernel instead. As a consequence, if a user removed files or directories from a container or an image in RHEL 8.4, using that container or image in RHEL 8.5 and later might cause problems. This issue does not apply to containers created by the root account.

Files or directories that are to be removed from a container or from an image are marked using the whiteout format of the fuse-overlayfs file system. However, due to differences in the format, the kernel overlayfs implementation does not recognize the whiteout format created by fuse-overlayfs. As a result, any removed files or directories still appear.

To work around this problem, use one of the following options:

  1. Remove all of the unprivileged user’s containers and images using the podman unshare rm -rf $HOME/.local/share/containers/* command. When a user next runs Podman, the $HOME/.local/share/containers directory is recreated, and they will need to recreate their containers.
  2. Continue to run the podman command as a rootless user. The first time an updated version of Podman is invoked on the system, it scans all of the files in the $HOME/.local/share/containers directory and detects whether to use fuse-overlayfs. Podman records the result of the scan so that it does not re-run the scan later. As a result, the removed files do not appear.

The time required to detect whether fuse-overlayfs is still necessary depends on the number of files and directories in the containers and images that need to be scanned.

(JIRA:RHELPLAN-92741)

Running systemd within an older container image does not work

Running systemd within an older container image, for example, centos:7, does not work:

$ podman run --rm -ti centos:7 /usr/lib/systemd/systemd
 Storing signatures
 Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
 [!!!!!!] Failed to mount API filesystems, freezing.

To work around this problem, use the following commands:

# mkdir /sys/fs/cgroup/systemd
# mount none -t cgroup -o none,name=systemd /sys/fs/cgroup/systemd
# podman run --runtime /usr/bin/crun --annotation=run.oci.systemd.force_cgroup_v1=/sys/fs/cgroup --rm -ti centos:7 /usr/lib/systemd/systemd

(JIRA:RHELPLAN-96940)

Container images signed with a Beta GPG key cannot be pulled

Currently, when you try to pull RHEL Beta container images, podman exits with the error message: Error: Source image rejected: None of the signatures were accepted. The images fail to pull because current builds are configured not to trust the RHEL Beta GPG keys by default.

As a workaround, ensure that the Red Hat Beta GPG key is stored on your local system and update the existing trust scope with the podman image trust set command for the appropriate beta namespace.

If you do not have the Beta GPG key stored locally, you can download it by running the following command:

$ sudo wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-beta https://www.redhat.com/security/data/f21541eb.txt

To add the Beta GPG key as trusted to your namespace, use one of the following commands:

$ sudo podman image trust set -f /etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-beta registry.access.redhat.com/namespace

or

$ sudo podman image trust set -f /etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-beta registry.redhat.io/namespace

Replace namespace with ubi9-beta or rhel9-beta.

(BZ#2020301)