Chapter 10. Known issues

This part describes known issues in Red Hat Enterprise Linux 8.4.

10.1. Installer and image creation

The auth and authconfig Kickstart commands require the AppStream repository

The authselect-compat package is required by the auth and authconfig Kickstart commands during installation. Without this package, the installation fails if auth or authconfig are used. However, by design, the authselect-compat package is only available in the AppStream repository.

To work around this problem, verify that the BaseOS and AppStream repositories are available to the installer or use the authselect Kickstart command during installation.

(BZ#1640697)

The reboot --kexec and inst.kexec commands do not provide a predictable system state

Performing a RHEL installation with the reboot --kexec Kickstart command or the inst.kexec kernel boot parameters do not provide the same predictable system state as a full reboot. As a consequence, switching to the installed system without rebooting can produce unpredictable results.

Note that the kexec feature is deprecated and will be removed in a future release of Red Hat Enterprise Linux.

(BZ#1697896)

Network access is not enabled by default in the installation program

Several installation features require network access, for example, registration of a system using the Content Delivery Network (CDN), NTP server support, and network installation sources. However, network access is not enabled by default, and as a result, these features cannot be used until network access is enabled.

To work around this problem, add ip=dhcp to boot options to enable network access when the installation starts. Optionally, passing a Kickstart file or a repository located on the network using boot options also resolves the problem. As a result, the network-based installation features can be used.

(BZ#1757877)

The USB CD-ROM drive is not available as an installation source in Anaconda

Installation fails when the USB CD-ROM drive is the source for it and the Kickstart ignoredisk --only-use= command is specified. In this case, Anaconda cannot find and use this source disk.

To work around this problem, use the harddrive --partition=sdX --dir=/ command to install from USB CD-ROM drive. As a result, the installation does not fail.

(BZ#1914955)

Anaconda does not show encryption for a custom partition

The Encrypt my data radio button is not available when you choose the Custom partitioning during the system installation. As a result, your data is not encrypted when installation is complete.

To workaround this problem, set encryption in the custom partitioning screen for each device you want to encrypt. Anaconda will ask for a passphrase when leaving the dialog.

(BZ#1903786)

Installation program attempts automatic partitioning when no partitioning scheme is specified in the Kickstart file

When using a Kickstart file to perform an automated installation, the installation program attempts to perform automatic partitioning even when you do not specify any partitioning commands in the Kickstart file. The installation program behaves as if the autopart command was used in the Kickstart file, resulting in unexpected partitions. To work around this problem, use the reqpart command in the Kickstart file so that you can interactively configure manual partitioning.

(BZ#1954408)

The new osbuild-composer back end does not replicate the blueprint state from lorax-composer on upgrades

Image Builder users that are upgrading from the lorax-composer back end to the new osbuild-composer back end, blueprints can disappear. As a result, once the upgrade is complete, the blueprints do not display automatically. To work around this problem, perform the following steps.

Prerequisites

You have the composer-cli CLI utility installed.

Procedure

Run the command to load the previous lorax-composer based blueprints into the new osbuild-composer back end:

$ for blueprint in $(find /var/lib/lorax/composer/blueprints/git/workspace/master -name '*.toml'); do composer-cli blueprints push "${blueprint}"; done

As a result, the same blueprints are now available in osbuild-composer back end.

Additional resources

For more details about this Known Issue, see the Image Builder blueprints are no longer present following an update to Red Hat Enterprise Linux 8.3 article.

(BZ#1897383)

Adding the same username in both blueprint and Kickstart files causes Edge image installation to fail

To install a RHEL for Edge image, users must create a blueprint to build a rhel-edge-container image and also create a Kickstart file to install the RHEL for Edge image. When a user adds the same username, password, and SSH key in both the blueprint and the Kickstart file, the RHEL for Edge image installation fails. Currently, there is no workaround.

(BZ#1951964)

GUI installation might fail if an attempt to unregister using the CDN is made before the repository refresh is completed

Since RHEL 8.2, when registering your system and attaching subscriptions using the Content Delivery Network (CDN), a refresh of the repository metadata is started by the GUI installation program. The refresh process is not part of the registration and subscription process, and as a consequence, the Unregister button is enabled in the Connect to Red Hat window. Depending on the network connection, the refresh process might take more than a minute to complete. If you click the Unregister button before the refresh process is completed, the GUI installation might fail as the unregister process removes the CDN repository files and the certificates required by the installation program to communicate with the CDN.

To work around this problem, complete the following steps in the GUI installation after you have clicked the Register button in the Connect to Red Hat window:

From the Connect to Red Hat window, click Done to return to the Installation Summary window.
From the Installation Summary window, verify that the Installation Source and Software Selection status messages in italics are not displaying any processing information.
When the Installation Source and Software Selection categories are ready, click Connect to Red Hat.
Click the Unregister button.

After performing these steps, you can safely unregister the system during the GUI installation.

(BZ#1821192)

Registration fails for user accounts that belong to multiple organizations

Currently, when you attempt to register a system with a user account that belongs to multiple organizations, the registration process fails with the error message You must specify an organization for new units.

To work around this problem, you can either:

Use a different user account that does not belong to multiple organizations.
Use the Activation Key authentication method available in the Connect to Red Hat feature for GUI and Kickstart installations.
Skip the registration step in Connect to Red Hat and use Subscription Manager to register your system post-installation.

(BZ#1822880)

Red Hat Insights client fails to register the operating system when using the graphical installer

Currently, the installation fails with an error at the end, which points to the Insights client.

To work around this problem, uncheck the Connect to Red Hat Insights option during the Connect to Red Hat step before registering the systems in the installer.

As a result, you can complete the installation and register to Insights afterwards by using this command:

# insights-client --register

(BZ#1931069)

Installation with autopart utility fails with inconsistent disk sector sizes

Installing RHEL using autopart with multiple inconsistent disk sector sizes fails. As a workaround, use a plain partitioning scheme, for example autopart --type=plain, instead of the default LVM scheme. Another option is to try re-configuring sector sizes, for example by running hdparm --set-sector-size=<SIZE> <DEVICE>.

As a workaround for kickstart installations:

Restrict the disks used for the partitioning by specifying ignoredisk --drives=.. OR --only-use=...
Specify disks to be used for each created LVM Physical Volume: partition pv.1 --ondisk=...

As a workaround for manual installations:

Select only the disks with the same sector size during manual installation in graphical or text mode.
When disks with inconsistent sector size are selected for the installation, restrict each created LVM Volume Group to use Physical Volumes with the same sector size. This can only be done in graphical mode in the Custom partitioning spoke.

(BZ#1935722)

The GRUB retries to access the disk after initial failures during boot

Sometimes, Storage Area Networks (SANs) fail to acknowledge the open and read disk calls. Previously, the GRUB tool used to enter into the grub_rescue prompt resulting in the boot failure. With this update, GRUB retries to access the disk up to 20 times after the initial call to open and read the disk fails. If the GRUB tool is still unable to open or read the disk after these attempts, it will enter into the grub_rescue mode.

(BZ#1987087)

IBM Power systems with HASH MMU mode fail to boot with memory allocation failures

IBM Power Systems with HASH memory allocation unit (MMU) mode support kdump up to a maximum of 192 cores. Consequently, the system fails to boot with memory allocation failures if kdump is enabled on more than 192 cores. This limitation is due to RMA memory allocations during early boot in HASH MMU mode. To work around this problem, use the Radix MMU mode with fadump enabled instead of using kdump.

(BZ#2028361)

Unable to rebuild grub.cfg by using grub2-mkconfig on rhel-guest-image-8.4 images

The rhel-guest-image-8.4 type does not contain the entry 'GRUB_DEFAULT=saved' entry in the /etc/default/grub file. As a consequence, if you install a new kernel and rebuild the grub using the grub2-mkconfig -o /boot/grub2/grub.cfg command, after reboot, the system will not boot up with the new kernel. To work around this issue, you can append the GRUB_DEFAULT=saved to the /etc/default/grub file. As a result, the system should boot up with the new kernel.

(BZ#2227218)

10.2. Subscription management

syspurpose addons have no effect on the subscription-manager attach --auto output.

In Red Hat Enterprise Linux 8, four attributes of the syspurpose command-line tool have been added: role,usage, service_level_agreement and addons. Currently, only role, usage and service_level_agreement affect the output of running the subscription-manager attach --auto command. Users who attempt to set values to the addons argument will not observe any effect on the subscriptions that are auto-attached.

(BZ#1687900)

10.3. Infrastructure services

Postfix TLS fingerprint algorithm in the FIPS mode needs to be changed to SHA-256

By default in RHEL 8, postfix uses MD5 fingerprints with the TLS for backward compatibility. But in the FIPS mode, the MD5 hashing function is not available, which may cause TLS to incorrectly function in the default postfix configuration. To workaround this problem, the hashing function needs to be changed to SHA-256 in the postfix configuration file.

For more details, see the related Knowledgebase article Fix postfix TLS in the FIPS mode by switching to SHA-256 instead of MD5.

(BZ#1711885)

10.4. Security

Users can run sudo commands as locked users

In systems where sudoers permissions are defined with the ALL keyword, sudo users with permissions can run sudo commands as users whose accounts are locked. Consequently, locked and expired accounts can still be used to execute commands.

To work around this problem, enable the newly implemented runas_check_shell option together with proper settings of valid shells in /etc/shells. This prevents attackers from running commands under system accounts such as bin.

(BZ#1786990)

libselinux-python is available only through its module

The libselinux-python package contains only Python 2 bindings for developing SELinux applications and it is used for backward compatibility. For this reason, libselinux-python is no longer available in the default RHEL 8 repositories through the dnf install libselinux-python command.

To work around this problem, enable both the libselinux-python and python27 modules, and install the libselinux-python package and its dependencies with the following commands:

# dnf module enable libselinux-python
# dnf install libselinux-python

Alternatively, install libselinux-python using its install profile with a single command:

# dnf module install libselinux-python:2.8/common

As a result, you can install libselinux-python using the respective module.

(BZ#1666328)

udica processes UBI 8 containers only when started with --env container=podman

The Red Hat Universal Base Image 8 (UBI 8) containers set the container environment variable to the oci value instead of the podman value. This prevents the udica tool from analyzing a container JavaScript Object Notation (JSON) file.

To work around this problem, start a UBI 8 container using a podman command with the --env container=podman parameter. As a result, udica can generate an SELinux policy for a UBI 8 container only when you use the described workaround.

(BZ#1763210)

Negative effects of the default logging setup on performance

The default logging environment setup might consume 4 GB of memory or even more and adjustments of rate-limit values are complex when systemd-journald is running with rsyslog.

See the Negative effects of the RHEL default logging setup on performance and their mitigations Knowledgebase article for more information.

(JIRA:RHELPLAN-10431)

File permissions of /etc/passwd- are not aligned with the CIS RHEL 8 Benchmark 1.0.0

Because of an issue with the CIS Benchmark, the remediation of the SCAP rule that ensures permissions on the /etc/passwd- backup file configures permissions to 0644. However, the CIS Red Hat Enterprise Linux 8 Benchmark 1.0.0 requires file permissions 0600 for that file. As a consequence, the file permissions of /etc/passwd- are not aligned with the benchmark after remediation.

(BZ#1858866)

SELINUX=disabled in /etc/selinux/config does not work properly

Disabling SELinux using the SELINUX=disabled option in the /etc/selinux/config results in a process in which the kernel boots with SELinux enabled and switches to disabled mode later in the boot process. This might cause memory leaks.

To work around this problem, disable SELinux by adding the selinux=0 parameter to the kernel command line as described in the Changing SELinux modes at boot time section of the Using SELinux title if your scenario really requires to completely disable SELinux.

(JIRA:RHELPLAN-34199)

crypto-policies incorrectly allow Camellia ciphers

The RHEL 8 system-wide cryptographic policies should disable Camellia ciphers in all policy levels, as stated in the product documentation. However, the Kerberos protocol enables the ciphers by default.

To work around the problem, apply the NO-CAMELLIA subpolicy:

# update-crypto-policies --set DEFAULT:NO-CAMELLIA

In the previous command, replace DEFAULT with the cryptographic level name if you have switched from DEFAULT previously.

As a result, Camellia ciphers are correctly disallowed across all applications that use system-wide crypto policies only when you disable them through the workaround.

(BZ#1919155)

Connections to servers with SHA-1 signatures do not work with GnuTLS

SHA-1 signatures in certificates are rejected by the GnuTLS secure communications library as insecure. Consequently, applications that use GnuTLS as a TLS backend cannot establish a TLS connection to peers that offer such certificates. This behavior is inconsistent with other system cryptographic libraries.

To work around this problem, upgrade the server to use certificates signed with SHA-256 or stronger hash, or switch to the LEGACY policy.

(BZ#1628553)

Libreswan ignores the leftikeport and rightikeport options

Libreswan ignores the leftikeport and rightikeport options in any host-to-host Libreswan connections. As a consequence, Libreswan uses the default ports regardless of leftikeport and rightikeport settings. No workaround is available at the moment.

(BZ#1934058)

Using multiple labeled IPsec connections with IKEv2 do not work correctly

When Libreswan uses the IKEv2 protocol, security labels for IPsec do not work correctly for more than one connection. As a consequence, Libreswan using labeled IPsec can establish only the first connection, but cannot establish subsequent connections correctly. To use more than one connection, use the IKEv1 protocol.

(BZ#1934859)

OpenSSL in FIPS mode accepts only specific D-H parameters

In FIPS mode, TLS clients that use OpenSSL return a bad dh value error and abort TLS connections to servers that use manually generated parameters. This is because OpenSSL, when configured to work in compliance with FIPS 140-2, works only with Diffie-Hellman parameters compliant to NIST SP 800-56A rev3 Appendix D (groups 14, 15, 16, 17, and 18 defined in RFC 3526 and with groups defined in RFC 7919). Also, servers that use OpenSSL ignore all other parameters and instead select known parameters of similar size. To work around this problem, use only the compliant groups.

(BZ#1810911)

Smart-card provisioning process through OpenSC pkcs15-init does not work properly

The file_caching option is enabled in the default OpenSC configuration, and the file caching functionality does not handle some commands from the pkcs15-init tool properly. Consequently, the smart-card provisioning process through OpenSC fails.

To work around the problem, add the following snippet to the /etc/opensc.conf file:

app pkcs15-init {
        framework pkcs15 {
                use_file_caching = false;
        }
}

The smart-card provisioning through pkcs15-init only works if you apply the previously described workaround.

(BZ#1947025)

systemd cannot execute commands from arbitrary paths

The systemd service cannot execute commands from /home/user/bin arbitrary paths because the SELinux policy package does not include any such rule. Consequently, the custom services that are executed on non-system paths fail and eventually log the Access Vector Cache (AVC) denial audit messages when SELinux denied access. To work around this problem, do one of the following:

Execute the command using a shell script with the -c option. For example,
```
bash -c command
```
Execute the command from a common path using /bin, /sbin, /usr/sbin, /usr/local/bin, and /usr/local/sbin common directories.

(BZ#1860443)

selinux-policy prevents IPsec from working over TCP

The libreswan package in RHEL 8.4 supports IPsec-based VPNs using TCP encapsulation. However, the selinux-policy package does not reflect this update. As a consequence, when you set Libreswan to use TCP, the ipsec service fails to bind to the given TCP port.

To work around the problem, use a custom SELinux policy:

Open a new .cil file in a text editor, for example:
```
# vim local_ipsec_tcp_listen.cil
```

Insert the following rule:

(allow ipsec_t ipsecnat_port_t (tcp_socket (name_bind name_connect)))

Save and close the file.

Install the policy module:

# semodule -i local_ipsec_tcp_listen.cil

Restart the ipsec service:
```
# systemctl restart ipsec
```

As a result, Libreswan can bind and connect to the commonly used 4500/tcp port.

(BZ#1931848)

Installation with the Server with GUI or Workstation software selections and CIS security profile is not possible

The CIS security profile is not compatible with the Server with GUI and Workstation software selections. As a consequence, a RHEL 8 installation with the Server with GUI software selection and CIS profile is not possible. An attempted installation using the CIS profile and either of these software selections will generate the error message:

package xorg-x11-server-common has been added to the list of excluded packages, but it can't be removed from the current software selection without breaking the installation.

To work around the problem, do not use the CIS security profile with the Server with GUI or Workstation software selections.

(BZ#1843932)

rpm_verify_permissions fails in the CIS profile

The rpm_verify_permissions rule compares file permissions to package default permissions. However, the Center for Internet Security (CIS) profile, which is provided by the scap-security-guide packages, changes some file permissions to be more strict than default. As a consequence, verification of certain files using rpm_verify_permissions fails.

To work around this problem, manually verify that these files have the following permissions:

/etc/cron.d (0700)
/etc/cron.hourly (0700)
/etc/cron.monthly (0700)
/etc/crontab (0600)
/etc/cron.weekly (0700)
/etc/cron.daily (0700)

(BZ#1843913)

Kickstart uses org_fedora_oscap instead of com_redhat_oscap in RHEL 8

The Kickstart references the Open Security Content Automation Protocol (OSCAP) Anaconda add-on as org_fedora_oscap instead of com_redhat_oscap which might cause confusion. That is done to preserve backward compatibility with Red Hat Enterprise Linux 7.

(BZ#1665082)

Certain sets of interdependent rules in SSG can fail

Remediation of SCAP Security Guide (SSG) rules in a benchmark can fail due to undefined ordering of rules and their dependencies. If two or more rules need to be executed in a particular order, for example, when one rule installs a component and another rule configures the same component, they can run in the wrong order and remediation reports an error. To work around this problem, run the remediation twice, and the second run fixes the dependent rules.

(BZ#1750755)

OSCAP Anaconda Addon does not install all packages in text mode

The OSCAP Anaconda Addon plugin cannot modify the list of packages selected for installation by the system installer if the installation is running in text mode. Consequently, when a security policy profile is specified using Kickstart and the installation is running in text mode, any additional packages required by the security policy are not installed during installation.

To work around this problem, either run the installation in graphical mode or specify all packages that are required by the security policy profile in the security policy in the %packages section in your Kickstart file.

As a result, packages that are required by the security policy profile are not installed during RHEL installation without one of the described workarounds, and the installed system is not compliant with the given security policy profile.

(BZ#1674001)

OSCAP Anaconda Addon does not correctly handle customized profiles

The OSCAP Anaconda Addon plugin does not properly handle security profiles with customizations in separate files. Consequently, the customized profile is not available in the RHEL graphical installation even when you properly specify it in the corresponding Kickstart section.

To work around this problem, follow the instructions in the Creating a single SCAP data stream from an original DS and a tailoring file Knowledgebase article. As a result of this workaround, you can use a customized SCAP profile in the RHEL graphical installation.

(BZ#1691305)

Remediating service-related rules during kickstart installations might fail

During a kickstart installation, the OpenSCAP utility sometimes incorrectly shows that a service enable or disable state remediation is not needed. Consequently, OpenSCAP might set the services on the installed system to a non-compliant state. As a workaround, you can scan and remediate the system after the kickstart installation. This will fix the service-related issues.

(BZ#1834716)

Certain rsyslog priority strings do not work correctly

Support for the GnuTLS priority string for imtcp that allows fine-grained control over encryption is not complete. Consequently, the following priority strings do not work properly in rsyslog:

NONE:+VERS-ALL:-VERS-TLS1.3:+MAC-ALL:+DHE-RSA:+AES-256-GCM:+SIGN-RSA-SHA384:+COMP-ALL:+GROUP-ALL

To work around this problem, use only correctly working priority strings:

NONE:+VERS-ALL:-VERS-TLS1.3:+MAC-ALL:+ECDHE-RSA:+AES-128-CBC:+SIGN-RSA-SHA1:+COMP-ALL:+GROUP-ALL

As a result, current configurations must be limited to the strings that work correctly.

(BZ#1679512)

Conflict in SELinux Audit rules and SELinux boolean configurations

If the Audit rule list includes an Audit rule that contains a subj_* or obj_* field, and the SELinux boolean configuration changes, setting the SELinux booleans causes a deadlock. As a consequence, the system stops responding and requires a reboot to recover. To work around this problem, disable all Audit rules containing the subj_* or obj_* field, or temporarily disable such rules before changing SELinux booleans.

With the release of the RHSA-2021:2168 advisory, the kernel handles this situation properly and no longer deadlocks.

(BZ#1924230)

10.5. Networking

The nm-cloud-setup service removes manually-configured secondary IP addresses from interfaces

Based on the information received from the cloud environment, the nm-cloud-setup service configures network interfaces. Disable nm-cloud-setup to manually configure interfaces. However, in certain cases, other services on the host can configure interfaces as well. For example, these services could add secondary IP addresses. To avoid that nm-cloud-setup removes secondary IP addresses:

Stop and disable the nm-cloud-setup service and timer:

# systemctl disable --now nm-cloud-setup.service nm-cloud-setup.timer

Display the available connection profiles:
```
# nmcli connection show
```
Reactive the affected connection profiles:
```
# nmcli connection up "<profile_name>"
```

As a result, the service no longer removes manually-configured secondary IP addresses from interfaces.

(BZ#2132754)

IPsec network traffic fails during IPsec offloading when GRO is disabled

IPsec offloading is not expected to work when Generic Receive Offload (GRO) is disabled on the device. If IPsec offloading is configured on a network interface and GRO is disabled on that device, IPsec network traffic fails.

To work around this problem, keep GRO enabled on the device.

(BZ#1649647)

10.6. Kernel

Certain BCC utilities display a harmless warning

Due to macro redefinitions in some compiler specific kernel headers. Some BPF Compiler Collection (BCC) utilities show the following warning:

warning: __no_sanitize_address' macro redefined [-Wmacro-redefined]

The warning is harmless, and you can ignore it.

(BZ#1907271)

A vmcore capture fails after memory hot-plug or unplug operation

After performing the memory hot-plug or hot-unplug operation, the event comes after updating the device tree which contains memory layout information. Thereby the makedumpfile utility tries to access a non-existent physical address. The problem appears if all of the following conditions meet:

A little-endian variant of IBM Power System runs RHEL 8.
The kdump or fadump service is enabled on the system.

Consequently, the capture kernel fails to save vmcore if a kernel crash is triggered after the memory hot-plug or hot-unplug operation.

To work around this problem, restart the kdump service after hot-plug or hot-unplug:

# systemctl restart kdump.service

As a result, vmcore is successfully saved in the described scenario.

(BZ#1793389)

kdump fails to dump vmcore on SSH or NFS dump targets

The new version of dracut-network drops dependency on dhcp-client that requires an ipcalc. Consequently, when NIC port is configured to a static IP and kdump is configured to dump on SSH or NFS dump targets, kdump fails with the following error message:

ipcalc: command not found

To work around this problem:

Install the ipcalc package manually.
```
dnf install ipcalc
```
Rebuild the initramfs for kdump.
```
kdumpctrl rebuild
```
Restart the kdump service.
```
systemctl restart kdump
```

As a result, kdump is successful in the described scenario.

(BZ#1931266)

Debug kernel fails to boot in crash capture environment in RHEL 8

Due to memory-demanding nature of the debug kernel, a problem occurs when the debug kernel is in use and a kernel panic is triggered. As a consequence, the debug kernel is not able to boot as the capture kernel, and a stack trace is generated instead. To work around this problem, increase the crash kernel memory accordingly. As a result, the debug kernel successfully boots in the crash capture environment.

(BZ#1659609)

Memory allocation on crash kernel fails at boot time

On some Ampere Altra systems, memory allocation fails when the 32-bit region is disabled in BIOS settings. Consequently, the kdump service fails to start because the conventional memory is not large enough to reserve the memory allocation.

To work around this problem, enable 32-bit CPU in BIOS as follows:

Open the BIOS settings on your system.
Open the Chipset menu.
Under Memory Configuration, enable the Slave 32-bit option.

As a result, the crash kernel allocates memory within the 32-bit region and the kdump service works as expected.

(BZ#1940674)

Certain kernel drivers do not display their version

The behavior for module versioning of many networking kernel drivers has changed in RHEL 8.4. Consequently, those drivers now do not display their version. Alternatively, after executing the ethtool -i command, the drivers display the kernel version instead of the driver version. To work around this problem, users can run the following command:

# modinfo <AFFECTED_DRIVER> | grep rhelversion

As a result, users can determine versions of the affected kernel drivers in scenarios where it is necessary.

Note that the perceived amount of change in a driver version string has no actual bearing on the amount of change in the driver itself.

(BZ#1944639)

Using irqpoll causes vmcore generation failure

Due to an existing problem with the nvme driver on the 64-bit ARM architectures that run on the Amazon Web Services (AWS) cloud platforms, the vmcore generation fails when you provide the irqpoll kernel command line parameter to the first kernel. Consequently, no vmcore file is dumped in the /var/crash/ directory after a kernel crash. To work around this problem:

Append irqpoll to KDUMP_COMMANDLINE_REMOVE in the /etc/sysconfig/kdump file.

KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swiotlb"

Remove irqpoll from KDUMP_COMMANDLINE_APPEND in the /etc/sysconfig/kdump file.

KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory udev.children-max=2 panic=10 swiotlb=noforce novmcoredd"

Restart the kdump service by running the systemctl restart kdump command.

As a result, the first kernel boots correctly and the vmcore file is expected to be captured upon the kernel crash.

Note that the kdump service can use a significant amount of crash kernel memory to dump the vmcore file. Ensure that the capture kernel has sufficient memory available for the kdump service.

(BZ#1654962)

The HP NMI watchdog does not always generate a crash dump

In certain cases, the hpwdt driver for the HP NMI watchdog is not able to claim a non-maskable interrupt (NMI) generated by the HPE watchdog timer because the NMI was instead consumed by the perfmon driver.

The missing NMI is initiated by one of two conditions:

The Generate NMI button on the Integrated Lights-Out (iLO) server management software. This button is triggered by a user.
The hpwdt watchdog. The expiration by default sends an NMI to the server.

Both sequences typically occur when the system is unresponsive. Under normal circumstances, the NMI handler for both these situations calls the kernel panic() function and if configured, the kdump service generates a vmcore file.

Because of the missing NMI, however, kernel panic() is not called and vmcore is not collected.

In the first case (1.), if the system was unresponsive, it remains so. To work around this scenario, use the virtual Power button to reset or power cycle the server.

In the second case (2.), the missing NMI is followed 9 seconds later by a reset from the Automated System Recovery (ASR).

The HPE Gen9 Server line experiences this problem in single-digit percentages. The Gen10 at an even smaller frequency.

(BZ#1602962)

The tuned-adm profile powersave command causes the system to become unresponsive

Executing the tuned-adm profile powersave command leads to an unresponsive state of the Penguin Valkyrie 2000 2-socket systems with the older Thunderx (CN88xx) processors. Consequently, reboot the system to resume working. To work around this problem, avoid using the powersave profile if your system matches the mentioned specifications.

(BZ#1609288)

The kernel ACPI driver reports it has no access to a PCIe ECAM memory region

The Advanced Configuration and Power Interface (ACPI) table provided by firmware does not define a memory region on the PCI bus in the Current Resource Settings (_CRS) method for the PCI bus device. Consequently, the following warning message occurs during the system boot:

[    2.817152] acpi PNP0A08:00: [Firmware Bug]: ECAM area [mem 0x30000000-0x31ffffff] not reserved in ACPI namespace
[    2.827911] acpi PNP0A08:00: ECAM at [mem 0x30000000-0x31ffffff] for [bus 00-1f]

However, the kernel is still able to access the 0x30000000-0x31ffffff memory region, and can assign that memory region to the PCI Enhanced Configuration Access Mechanism (ECAM) properly. You can verify that PCI ECAM works correctly by accessing the PCIe configuration space over the 256 byte offset with the following output:

03:00.0 Non-Volatile memory controller: Sandisk Corp WD Black 2018/PC SN720 NVMe SSD (prog-if 02 [NVM Express])
 ...
        Capabilities: [900 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
                          PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=0ns
                L1SubCtl2: T_PwrOn=10us

As a result, you can ignore the warning message.

For more information about the problem, see the "Firmware Bug: ECAM area mem 0x30000000-0x31ffffff not reserved in ACPI namespace" appears during system boot solution.

(BZ#1868526)

The hwloc commands with the default settings do not work on single CPU Power9 and Power10 LPARs

With the hwloc package of version 2.2.0, any single-node Non-Uniform Memory Access (NUMA) system that runs Power9 / Power10 CPU is considered to be "disallowed". Consequently, all hwloc commands do not work and the following error message is displayed:

Topology does not contain any NUMA node, aborting!

You can use either of these two options to work around this problem:

Set the environment variable HWLOC_ALLOW=all
Use the disallowed flag with various hwloc commands

As a result, the hwloc command does not return any errors in the described scenario.

(BZ#1917560)

The OPEN MPI library may trigger run-time failures with default PML

In OPEN Message Passing Interface (OPEN MPI) implementation 4.0.x series, Unified Communication X (UCX) is the default point-to-point communicator (PML). The later versions of OPEN MPI 4.0.x series deprecated openib Byte Transfer Layer (BTL).

However, OPEN MPI, when run over a homogeneous cluster (same hardware and software configuration), UCX still uses openib BTL for MPI one-sided operations. As a consequence, this may trigger execution errors. To work around this problem:

Run the mpirun command using following parameters:

-mca btl openib -mca pml ucx -x UCX_NET_DEVICES=mlx5_ib0

where,

The -mca btl openib parameter disables openib BTL
The -mca pml ucx parameter configures OPEN MPI to use ucx PML.
The x UCX_NET_DEVICES= parameter restricts UCX to use the specified devices

The OPEN MPI, when run over a heterogeneous cluster (different hardware and software configuration), it uses UCX as the default PML. As a consequence, this may cause the OPEN MPI jobs to run with erratic performance, unresponsive behavior, or crash failures. To work around this problem, set the UCX priority as:

Run the mpirun command using following parameters:

-mca pml_ucx_priority 5

As a result, the OPEN MPI library is able to choose an alternative available transport layer over UCX.

(BZ#1866402)

Connections fail when attaching a virtual function to virtual machine

Pensando network cards that use the ionic device driver silently accept VLAN tag configuration requests and attempt configuring network connections while attaching network virtual functions (VF) to a virtual machine (VM). Such network connections fail as this feature is not yet supported by the card’s firmware.

(BZ#1930576)

10.7. Hardware enablement

The default 7 4 1 7 printk value sometimes causes temporary system unresponsiveness

The default 7 4 1 7 printk value allows for better debugging of the kernel activity. However, when coupled with a serial console, this printk setting can cause intense I/O bursts that can lead to a RHEL system becoming temporarily unresponsive. To work around this problem, we have added a new optimize-serial-console TuneD profile, which reduces the default printk value to 4 4 1 7. Users can instrument their system as follows:

# tuned-adm profile throughput-performance optimize-serial-console

Having a lower printk value persistent across a reboot reduces the likelihood of system hangs.

Note that this setting change comes at the expense of losing the extra debugging information.

(JIRA:RHELPLAN-28940)

10.8. File systems and storage

The /boot file system cannot be placed on LVM

You cannot place the /boot file system on an LVM logical volume. This limitation exists for the following reasons:

On EFI systems, the EFI System Partition conventionally serves as the /boot file system. The uEFI standard requires a specific GPT partition type and a specific file system type for this partition.
RHEL 8 uses the Boot Loader Specification (BLS) for system boot entries. This specification requires that the /boot file system is readable by the platform firmware. On EFI systems, the platform firmware can read only the /boot configuration defined by the uEFI standard.
The support for LVM logical volumes in the GRUB 2 boot loader is incomplete. Red Hat does not plan to improve the support because the number of use cases for the feature is decreasing due to standards such as uEFI and BLS.

Red Hat does not plan to support /boot on LVM. Instead, Red Hat provides tools for managing system snapshots and rollback that do not need the /boot file system to be placed on an LVM logical volume.

(BZ#1496229)

LVM no longer allows creating volume groups with mixed block sizes

LVM utilities such as vgcreate or vgextend no longer allow you to create volume groups (VGs) where the physical volumes (PVs) have different logical block sizes. LVM has adopted this change because file systems fail to mount if you extend the underlying logical volume (LV) with a PV of a different block size.

To re-enable creating VGs with mixed block sizes, set the allow_mixed_block_sizes=1 option in the lvm.conf file.

(BZ#1768536)

Limitations of LVM writecache

The writecache LVM caching method has the following limitations, which are not present in the cache method:

You cannot name a writecache logical volume when using pvmove commands.
You cannot use logical volumes with writecache in combination with thin pools or VDO.

The following limitation also applies to the cache method:

You cannot resize a logical volume while cache or writecache is attached to it.

(JIRA:RHELPLAN-27987, BZ#1798631, BZ#1808012)

LVM mirror devices that store a LUKS volume sometimes become unresponsive

Mirrored LVM devices with a segment type of mirror that store a LUKS volume might become unresponsive under certain conditions. The unresponsive devices reject all I/O operations.

To work around the issue, Red Hat recommends that you use LVM RAID 1 devices with a segment type of raid1 instead of mirror if you need to stack LUKS volumes on top of resilient software-defined storage.

The raid1 segment type is the default RAID configuration type and replaces mirror as the recommended solution.

To convert mirror devices to raid1, see Converting a mirrored LVM device to a RAID1 device.

(BZ#1730502)

An NFS 4.0 patch can result in reduced performance under an open-heavy workload

Previously, a bug was fixed that, in some cases, could cause an NFS open operation to overlook the fact that a file had been removed or renamed on the server. However, the fix may cause slower performance with workloads that require many open operations. To work around this problem, it might help to use NFS version 4.1 or higher, which have been improved to grant delegations to clients in more cases, allowing clients to perform open operations locally, quickly, and safely.

(BZ#1748451)

xfs_quota state doesn’t output all grace times when multiple quota types are specified

Currently, the xfs_quota state command doesn’t output the grace time for quotas as expected with options specifying multiple quota types. To work around this issue, specify the required quota type in command option individually, i. e. xfs_quota state -g, xfs_quota state -p or xfs_quota state -u.

(BZ#1949743)

10.9. High availability and clusters

The ocf:heartbeat:pgsql resource agent and any third-party agents that parse crm_mon output in their stop operation may fail to stop during a shutdown process in RHEL 8.4

In the RHEL 8.4 GA release, Pacemaker’s crm_mon command-line tool was modified to display a "shutting down" message rather than the usual cluster information when Pacemaker starts to shut down. As a result, shutdown progress, such as the stopping of resources, can not be monitored, and resource agents that parse crm_mon output in their stop operation (such as the ocf:heartbeat:pgsql agent distributed with the resource-agents package, or some custom or third-party agents) could fail to stop, leading to cluster problems.

It is recommended that clusters that use the ocf:heartbeat:pgsql resource agent not be upgraded to RHEL 8.4 until the z-stream is available.

(BZ#1948620)

10.10. Dynamic programming languages, web and database servers

getpwnam() might fail when called by a 32-bit application

When a user of NIS uses a 32-bit application that calls the getpwnam() function, the call fails if the nss_nis.i686 package is missing. To work around this problem, manually install the missing package by using the yum install nss_nis.i686 command.

(BZ#1803161)

Symbol conflicts between OpenLDAP libraries might cause crashes in httpd

When both the libldap and libldap_r libraries provided by OpenLDAP are loaded and used within a single process, symbol conflicts between these libraries might occur. Consequently, Apache httpd child processes using the PHP ldap extension might terminate unexpectedly if the mod_security or mod_auth_openidc modules are also loaded by the httpd configuration.

Since the RHEL 8.3 update to the Apache Portable Runtime (APR) library, you can work around the problem by setting the APR_DEEPBIND environment variable, which enables the use of the RTLD_DEEPBIND dynamic linker option when loading httpd modules. When the APR_DEEPBIND environment variable is enabled, crashes no longer occur in httpd configurations that load conflicting libraries.

(BZ#1819607)

MariaDB 10.5 does not warn about dropping a non-existent table when the OQGraph plug-in is enabled

When the OQGraph storage engine plug-in is loaded to the MariaDB 10.5 server, MariaDB does not warn about dropping a non-existent table. In particular, when the user attempts to drop a non-existent table using the DROP TABLE or DROP TABLE IF EXISTS SQL commands, MariaDB neither returns an error message nor logs a warning.

Note that the OQGraph plug-in is provided by the mariadb-oqgraph-engine package, which is not installed by default.

(BZ#1944653)

PAM plug-in version 1.0 does not work in MariaDB

MariaDB 10.3 provides the Pluggable Authentication Modules (PAM) plug-in version 1.0. MariaDB 10.5 provides the plug-in versions 1.0 and 2.0, version 2.0 is the default.

The MariaDB PAM plug-in version 1.0 does not work in RHEL 8. To work around this problem, use the PAM plug-in version 2.0 provided by the mariadb:10.5 module stream.

(BZ#1942330)

pyodbc does not work with MariaDB 10.3

The pyodbc module currently does not work with the MariaDB 10.3 server included in the RHEL 8.4 release. Earlier versions of the MariaDB 10.3 server and the MariaDB 10.5 server are not affected by this problem.

Note that the root cause is in the mariadb-connector-odbc package and the affected package versions are as follows:

pyodbc-4.0.30
mariadb-server-10.3.27
mariadb-connector-odbc-3.0.7

(BZ#1944692)

10.11. Compilers and development tools

GCC Toolset 10: Valgrind erroneously reports IBM z15 architecture support

Valgrind does not support certain IBM z15 processors features yet, but a bug in GCC Toolset 10 Valgrind causes it to report z15 support when run on a z15-capable system. As a consequence, software that tries to use z15 features when available cannot run under Valgrind. To work around this problem, when running on a z15 processor, use the system version of Valgrind accessible via /usr/bin/valgrind. This build will not report z15 support.

(BZ#1937340)

Memory leaks in pmproxy in PCP

The pmproxy service experiences memory leaks in Performance Co-Pilot (PCP) versions earlier than 5.3.0. The PCP version 5.3.0 is unavailable in RHEL 8.4 and the earlier minor versions of RHEL 8. As a consequence, RHEL 8 users might experience higher memory usage than expected.

To work around this problem, limit the memory usage of pmproxy:

Create the /etc/systemd/system/pmproxy.service.d/override.conf file by executing the following command:
```
# systemctl edit pmproxy
```
Add the following content to override.conf and save the changes:
```
[Service]
MemoryMax=10G
```
Replace the 10G value as per your requirement.
Restart the pmproxy service:
```
# systemctl restart pmproxy
```

As a result, the pmproxy service is restarted if the memory usage of pmproxy reaches the given limit.

(BZ#1991659)

10.12. Identity Management

Installing KRA fails if all KRA members are hidden replicas

The ipa-kra-install utility fails on a cluster where the Key Recovery Authority (KRA) is already present, if the first KRA instance is installed on a hidden replica. Consequently, you cannot add further KRA instances to the cluster.

To work around this problem, unhide the hidden replica that has the KRA role before you add new KRA instances. You can hide it again when ipa-kra-install completes successfully.

(BZ#1816784)

Using the cert-fix utility with the --agent-uid pkidbuser option breaks Certificate System

Using the cert-fix utility with the --agent-uid pkidbuser option corrupts the LDAP configuration of Certificate System. As a consequence, Certificate System might become unstable and manual steps are required to recover the system.

(BZ#1729215)

The /var/log/lastlog sparse file on IdM hosts can cause performance problems

During the IdM installation, a range of 200,000 UIDs from a total of 10,000 possible ranges is randomly selected and assigned. Selecting a random range in this way significantly reduces the probability of conflicting IDs in case you decide to merge two separate IdM domains in the future.

However, having high UIDs can create problems with the /var/log/lastlog file. For example, if a user with the UID of 1280000008 logs in to an IdM client, the local /var/log/lastlog file size increases to almost 400 GB. Although the actual file is sparse and does not use all that space, certain applications are not designed to identify sparse files by default and may require a specific option to handle them. For example, if the setup is complex and a backup and copy application does not handle sparse files correctly, the file is copied as if its size was 400 GB. This behavior can cause performance problems.

To work around this problem:

In case of a standard package, refer to its documentation to identify the option that handles sparse files.
In case of a custom application, ensure that it is able to manage sparse files such as /var/log/lastlog correctly.

(JIRA:RHELPLAN-59111)

FreeRADIUS silently truncates Tunnel-Passwords longer than 249 characters

If a Tunnel-Password is longer than 249 characters, the FreeRADIUS service silently truncates it. This may lead to unexpected password incompatibilities with other systems.

To work around the problem, choose a password that is 249 characters or fewer.

(BZ#1723362)

FIPS mode does not support using a shared secret to establish a cross-forest trust

Establishing a cross-forest trust using a shared secret fails in FIPS mode because NTLMSSP authentication is not FIPS-compliant. To work around this problem, authenticate with an Active Directory (AD) administrative account when establishing a trust between an IdM domain with FIPS mode enabled and an AD domain.

(BZ#1924707)

Downgrading authselect after the rebase to version 1.2.2 breaks system authentication

The authselect package has been rebased to the latest upstream version 1.2.2. Downgrading authselect is not supported and breaks system authentication for all users, including root.

If you downgraded the authselect package to 1.2.1 or earlier, perform the following steps to work around this problem:

At the GRUB boot screen, select Red Hat Enterprise Linux with the version of the kernel that you want to boot and press e to edit the entry.
Type single as a separate word at the end of the line that starts with linux and press Ctrl+x to start the boot process.
Upon booting in single-user mode, enter the root password.
Restore authselect configuration using the following command:
```
# authselect select sssd --force
```

(BZ#1892761)

Upgrading an IdM server from RHEL 8.3 to RHEL 8.4 fails if pki-ca package version is earlier than 10.10.5

The IdM server upgrade program, ipa-server-upgrade, fails if the pki-ca package version is earlier than 10.10.5. As the required files do not exist in these versions, the IdM server upgrade does not complete successfully both at package installation and when ipa-server-upgrade or ipactl are executed.

To resolve this issue, upgrade the pki-* packages to version 10.10.5 or higher and run the ipa-server-upgrade command again.

(BZ#1957768)

Potential risk when using the default value for ldap_id_use_start_tls option

When using ldap:// without TLS for identity lookups, it can pose a risk for an attack vector. Particularly a man-in-the-middle (MITM) attack which could allow an attacker to impersonate a user by altering, for example, the UID or GID of an object returned in an LDAP search.

Currently, the SSSD configuration option to enforce TLS, ldap_id_use_start_tls, defaults to false. Ensure that your setup operates in a trusted environment and decide if it is safe to use unencrypted communication for id_provider = ldap. Note id_provider = ad and id_provider = ipa are not affected as they use encrypted connections protected by SASL and GSSAPI.

If it is not safe to use unencrypted communication, enforce TLS by setting the ldap_id_use_start_tls option to true in the /etc/sssd/sssd.conf file. The default behavior is planned to be changed in a future release of RHEL.

(JIRA:RHELPLAN-155168)

SSSD retrieves incomplete list of members if the group size exceeds 1500 members

During the integration of SSSD with Active Directory, SSSD retrieves incomplete group member lists when the group size exceeds 1500 members. This issue occurs because Active Directory’s MaxValRange policy, which restricts the number of members retrievable in a single query, is set to 1500 by default.

To work around this problem, change the MaxValRange setting in Active Directory to accommodate larger group sizes.

(JIRA:RHELDOCS-19603)

10.13. Desktop

Disabling flatpak repositories from Software Repositories is not possible

Currently, it is not possible to disable or remove flatpak repositories in the Software Repositories tool in the GNOME Software utility.

(BZ#1668760)

Drag-and-drop does not work between desktop and applications

Due to a bug in the gnome-shell-extensions package, the drag-and-drop functionality does not currently work between desktop and applications. Support for this feature will be added back in a future release.

(BZ#1717947)

Generation 2 RHEL 8 virtual machines sometimes fail to boot on Hyper-V Server 2016 hosts

When using RHEL 8 as the guest operating system on a virtual machine (VM) running on a Microsoft Hyper-V Server 2016 host, the VM in some cases fails to boot and returns to the GRUB boot menu. In addition, the following error is logged in the Hyper-V event log:

The guest operating system reported that it failed with the following error code: 0x1E

This error occurs due to a UEFI firmware bug on the Hyper-V host. To work around this problem, use Hyper-V Server 2019 as the host.

(BZ#1583445)

10.14. Graphics infrastructures

radeon fails to reset hardware correctly

The radeon kernel driver currently does not reset hardware in the kexec context correctly. Instead, radeon falls over, which causes the rest of the kdump service to fail.

To work around this problem, disable radeon in kdump by adding the following line to the /etc/kdump.conf file:

dracut_args --omit-drivers "radeon"
force_rebuild 1

Restart the machine and kdump. After starting kdump, the force_rebuild 1 line may be removed from the configuration file.

Note that in this scenario, no graphics will be available during kdump, but kdump will work successfully.

(BZ#1694705)

Multiple HDR displays on a single MST topology may not power on

On systems using NVIDIA Turing GPUs with the nouveau driver, using a DisplayPort hub (such as a laptop dock) with multiple monitors which support HDR plugged into it may result in failure to turn on. This is due to the system erroneously thinking there is not enough bandwidth on the hub to support all of the displays.

(BZ#1812577)

Unable to run graphical applications using sudo command

When trying to run graphical applications as a user with elevated privileges, the application fails to open with an error message. The failure happens because Xwayland is restricted by the Xauthority file to use regular user credentials for authentication.

To work around this problem, use the sudo -E command to run graphical applications as a root user.

(BZ#1673073)

VNC Viewer displays wrong colors with the 16-bit color depth on IBM Z

The VNC Viewer application displays wrong colors when you connect to a VNC session on an IBM Z server with the 16-bit color depth.

To work around the problem, set the 24-bit color depth on the VNC server. With the Xvnc server, replace the -depth 16 option with -depth 24 in the Xvnc configuration.

As a result, VNC clients display the correct colors but use more network bandwidth with the server.

(BZ#1886147)

Hardware acceleration is not supported on ARM

Built-in graphics drivers do not support hardware acceleration or the Vulkan API on the 64-bit ARM architecture.

To enable hardware acceleration or Vulkan on ARM, install the proprietary Nvidia driver.

(JIRA:RHELPLAN-57914)

GUI in ESXi might crash due to low video memory

The graphical user interface (GUI) on RHEL virtual machines (VMs) in the VMware ESXi 7.0.1 hypervisor with vCenter Server 7.0.1 requires a certain amount of video memory. If you connect multiple consoles or high-resolution monitors to the VM, the GUI requires least 16 MB of video memory. If you start the GUI with less video memory, the GUI might terminate unexpectedly.

To work around the problem, configure the hypervisor to assign at least 16 MB of video memory to the VM. As a result, the GUI on the VM no longer crashes.

(BZ#1910358)

10.15. Virtualization

virsh iface-\* commands do not work consistently

Currently, virsh iface-* commands, such as virsh iface-start and virsh iface-destroy, frequently fail due to configuration dependencies. Therefore, it is recommended not to use virsh iface-\* commands for configuring and managing host network connections. Instead, use the NetworkManager program and its related management applications.

(BZ#1664592)

Virtual machines sometimes fail to start when using many virtio-blk disks

Adding a large number of virtio-blk devices to a virtual machine (VM) may exhaust the number of interrupt vectors available in the platform. If this occurs, the VM’s guest OS fails to boot, and displays a dracut-initqueue[392]: Warning: Could not boot error.

(BZ#1719687)

Attaching LUN devices to virtual machines using virtio-blk does not work

The q35 machine type does not support transitional virtio 1.0 devices, and RHEL 8 therefore lacks support for features that were deprecated in virtio 1.0. In particular, it is not possible on a RHEL 8 host to send SCSI commands from virtio-blk devices. As a consequence, attaching a physical disk as a LUN device to a virtual machine fails when using the virtio-blk controller.

Note that physical disks can still be passed through to the guest operating system, but they should be configured with the device='disk' option rather than device='lun'.

(BZ#1777138)

Virtual machines using Cooperlake cannot boot when TSX is disabled on the host

Virtual machines (VMs) that use the Cooperlake CPU model currently fail to boot when the TSX CPU flag is diabled on the host. Instead, the host displays the following error message:

the CPU is incompatible with host CPU: Host CPU does not provide required features: hle, rtm

To make VMs with Cooperlake usable on such host, disable the HLE, RTM, and TAA_NO flags in the VM configuration in the VM’s XML configuration:

<feature policy='disable' name='hle'/>
<feature policy='disable' name='rtm'/>
<feature policy='disable' name='taa-no'/>

(BZ#1860743)

Using perf kvm record on IBM POWER Systems can cause the VM to crash

When using a RHEL 8 host on the little-endian variant of IBM POWER hardware, using the perf kvm record command to collect trace event samples for a KVM virtual machine (VM) in some cases results in the VM becoming unresponsive. This situation occurs when:

The perf utility is used by an unprivileged user, and the -p option is used to identify the VM - for example perf kvm record -e trace_cycles -p 12345.
The VM was started using the virsh shell.

To work around this problem, use the perf kvm utility with the -i option to monitor VMs that were created using the virsh shell. For example:

# perf kvm record -e trace_imc/trace_cycles/  -p <guest pid> -i

Note that when using the -i option, child tasks do not inherit counters, and threads will therefore not be monitored.

(BZ#1924016)

Migrating a POWER9 guest from a RHEL 7-ALT host to RHEL 8 fails

Currently, migrating a POWER9 virtual machine from a RHEL 7-ALT host system to RHEL 8 becomes unresponsive with a Migration status: active status.

To work around this problem, disable Transparent Huge Pages (THP) on the RHEL 7-ALT host, which enables the migration to complete successfully.

(BZ#1741436)

Using virt-customize sometimes causes guestfs-firstboot to fail

After modifying a virtual machine (VM) disk image using the virt-customize utility, the guestfs-firstboot service in some cases fails due to incorrect SELinux permissions. This causes a variety of problems during VM startup, such as failing user creation or system registration.

To avoid this problem, add the --selinux-relabel option to the virt-customize command.

(BZ#1554735)

Virtual machines with iommu_platform=on fail to start on IBM POWER

RHEL 8 currently does not support the iommu_platform=on parameter for virtual machines (VMs) on IBM POWER system. As a consequence, starting a VM with this parameter on IBM POWER hardware results in the VM becoming unresponsive during the boot process.

(BZ#1910848)

SMT CPU topology is not detected by VMs when using host passthrough mode on AMD EPYC

When a virtual machine (VM) boots with the CPU host passthrough mode on an AMD EPYC host, the TOPOEXT CPU feature flag is not present. Consequently, the VM is not able to detect a virtual CPU topology with multiple threads per core. To work around this problem, boot the VM with the EPYC CPU model instead of host passthrough.

(BZ#1740002)

Windows Server 2016 virtual machines with Hyper-V enabled fail to boot when using certain CPU models

Currently, it is not possible to boot a virtual machine (VM) that uses Windows Server 2016 as the guest operating system, has the Hyper-V role enabled, and uses one of the following CPU models:

EPYC-IBPB
EPYC

To work around this problem, use the EPYC-v3 CPU model, or manually enable the xsaves CPU flag for the VM.

(BZ#1942888)

Deleting a macvtap interface from a virtual machine resets all macvtap connections

Currently, deleting a macvtap interface from a running virtual machines (VM) with multiple macvtap devices also resets the connection settings of the other macvtap interfaces. As a consequence, the VM may experience network issues.

(BZ#1332758)

Hot unplugging an IBMVFC device on PowerVM fails

When using a virtual machine (VM) with a RHEL 8 guest operating system on the PowerVM hypervisor, attempting to remove an IBM Power Virtual Fibre Channel (IBMVFC) device from the running VM currently fails. Instead, it displays an outstanding translation error.

To work around this problem, remove the IBMVFC device when the VM is shut down.

(BZ#1959020)

IBM POWER hosts may crash when using the ibmvfc driver

When running RHEL 8 on a PowerVM logical partition (LPAR), a variety of errors may currently occur due problems with the ibmvfc driver. As a consequence, the host’s kernel may panic under certain circumstances, such as:

Using the Live Partition Mobility (LPM) feature
Resetting a host adapter
Using SCSI error handling (SCSI EH) functions

(BZ#1961722)

Mounting virtiofs directories fails in certain circumstances on RHEL 8 guests

Currently, when using the virtiofs feature to provide a host directory to a virtual machine (VM), mounting the directory on the VM fails with an "Operation not supported" error if the VM is using a RHEL 8.4 (or earlier) kernel but a RHEL 8.5 (or later) selinux-policy package.

To work around this problem, reboot the guest and boot it into the latest available kernel on the guest.

(BZ#1995558)

10.16. RHEL in cloud environments

kdump sometimes does not start on Azure and Hyper-V

On RHEL 8 guest operating systems hosted on the Microsoft Azure or Hyper-V hypervisors, starting the kdump kernel in some cases fails when post-exec notifiers are enabled.

To work around this problem, disable crash kexec post notifiers:

# echo N > /sys/module/kernel/parameters/crash_kexec_post_notifiers

(BZ#1865745)

Setting static IP in a RHEL 8 virtual machine on a VMWare host does not work

Currently, when using RHEL 8 as a guest operating system of a virtual machine (VM) on a VMWare host, the DatasourceOVF function does not work correctly. As a consequence, if you use the cloud-init utility to set the VM’s network to static IP and then reboot the VM, the VM’s network will be changed to DHCP.

(BZ#1750862)

Core dumping RHEL 8 virtual machines with certain NICs to a remote machine on Azure takes longer than expected

Currently, using the kdump utility to save the core dump file of a RHEL 8 virtual machine (VM) on a Microsoft Azure hypervisor to a remote machine does not work correctly when the VM is using a NIC with enabled accelerated networking. As a consequence, the dump file is saved after approximately 200 seconds, instead of immediately. In addition, the following error message is logged on the console before the dump file is saved.

device (eth0): linklocal6: DAD failed for an EUI-64 address

(BZ#1854037)

The nm-cloud-setup utility sets an incorrect default route on Microsoft Azure

On Microsoft Azure, the nm-cloud-setup utility fails to detect the correct gateway of the cloud environment. As a consequence, the utility sets an incorrect default route, and breaks connectivity. There is no workaround available at the moment.

(BZ#1912236)

The SCSI host address sometimes changes when booting a Hyper-V VM with multiple guest disks

Currently, when booting a RHEL 8 virtual machine (VM) on the Hyper-V hypervisor, the host portion of the Host, Bus, Target, Lun (HBTL) SCSI address in some cases changes. As a consequence, automated tasks set up with the HBTL SCSI identification or device node in the VM do not work consistently. This occurs if the VM has more than one disk or if the disks have different sizes.

To work around the problem, modify your kickstart files, using one of the following methods:

Method 1: Use persistent identifiers for SCSI devices.

You can use for example the following powershell script to determine the specific device identifiers:

# Output what the /dev/disk/by-id/<value> for the specified hyper-v virtual disk.
# Takes a single parameter which is the virtual disk file.
# Note: kickstart syntax works with and without the /dev/ prefix.
param (
    [Parameter(Mandatory=$true)][string]$virtualdisk
)

$what = Get-VHD -Path $virtualdisk
$part = $what.DiskIdentifier.ToLower().split('-')

$p = $part[0]
$s0 = $p[6] + $p[7] + $p[4] + $p[5] + $p[2] + $p[3] + $p[0] + $p[1]

$p = $part[1]
$s1 =  $p[2] + $p[3] + $p[0] + $p[1]

[string]::format("/dev/disk/by-id/wwn-0x60022480{0}{1}{2}", $s0, $s1, $part[4])

You can use this script on the hyper-v host, for example as follows:

PS C:\Users\Public\Documents\Hyper-V\Virtual hard disks> .\by-id.ps1 .\Testing_8\disk_3_8.vhdx
/dev/disk/by-id/wwn-0x60022480e00bc367d7fd902e8bf0d3b4
PS C:\Users\Public\Documents\Hyper-V\Virtual hard disks> .\by-id.ps1 .\Testing_8\disk_3_9.vhdx
/dev/disk/by-id/wwn-0x600224807270e09717645b1890f8a9a2

Afterwards, the disk values can be used in the kickstart file, for example as follows:

part / --fstype=xfs --grow --asprimary --size=8192 --ondisk=/dev/disk/by-id/wwn-0x600224807270e09717645b1890f8a9a2
part /home --fstype="xfs" --grow --ondisk=/dev/disk/by-id/wwn-0x60022480e00bc367d7fd902e8bf0d3b4

As these values are specific for each virtual disk, the configuration needs to be done for each VM instance. It may, therefore, be useful to use the %include syntax to place the disk information into a separate file.

Method 2: Set up device selection by size.

A kickstart file that configures disk selection based on size must include lines similar to the following:

...

# Disk partitioning information is supplied in a file to kick start
%include /tmp/disks

...

# Partition information is created during install using the %pre section
%pre --interpreter /bin/bash --log /tmp/ks_pre.log

	# Dump whole SCSI/IDE disks out sorted from smallest to largest ouputting
	# just the name
	disks=(`lsblk -n -o NAME -l -b -x SIZE -d -I 8,3`) || exit 1

	# We are assuming we have 3 disks which will be used
	# and we will create some variables to represent
	d0=${disks[0]}
	d1=${disks[1]}
	d2=${disks[2]}

	echo "part /home --fstype="xfs" --ondisk=$d2 --grow" >> /tmp/disks
	echo "part swap --fstype="swap" --ondisk=$d0 --size=4096" >> /tmp/disks
	echo "part / --fstype="xfs" --ondisk=$d1 --grow" >> /tmp/disks
	echo "part /boot --fstype="xfs" --ondisk=$d1 --size=1024" >> /tmp/disks

%end

(BZ#1906870)

RHEL 8 virtual machines have lower network performance on AWS ARM64 instances

When using RHEL 8 as a guest operating system in a virtual machine (VM) that runs on an Amazon Web Services (AWS) ARM64 instance, the VM has lower than expected network performance when the iommu.strict=1 kernel parameter is used or when no iommu.strict parameter is defined.

To work around this problem, change the parameter to iommu.strict=0. However, this can also decrease the security of the VM.

(BZ#1836058)

Hibernating RHEL 8 guests fails when FIPS mode is enabled

Currently, it is not possible to hibernate a virtual machine (VM) that uses RHEL 8 as its guest operating system if the VM is using FIPS mode.

(BZ#1934033, BZ#1944636)

SSH keys are not generated correctly on EC2 instanced created from a backup AMI

Currently, when creating a new Amazon EC2 instance of RHEL 8 from a backup Amazon Machine Image (AMI), cloud-init deletes existing SSH keys on the VM but does not create new ones. Consequently, the VM in some cases cannot connect to the host.

To work around this problem, edit the cloud.cgf file and change the "ssh_genkeytypes: ~" line to ssh_genkeytypes: ['rsa', 'ecdsa', 'ed25519'].

This makes it possible for SSH keys to be deleted and generated correctly when provisioning a RHEL 8 VM in the described circumstances.

(BZ#1957532)

SSH keys are not generated correctly on EC2 instanced created from a backup AMI

To work around this problem, edit the cloud.cgf file and change the "ssh_genkeytypes: ~" line to ssh_genkeytypes: ['rsa', 'ecdsa', 'ed25519'].

This makes it possible for SSH keys to be deleted and generated correctly when provisioning a RHEL 8 VM in the described circumstances.

(BZ#1963981)

10.17. Supportability

redhat-support-tool does not work with the FUTURE crypto policy

Because a cryptographic key used by a certificate on the Customer Portal API does not meet the requirements by the FUTURE system-wide cryptographic policy, the redhat-support-tool utility does not work with this policy level at the moment.

To work around this problem, use the DEFAULT crypto policy while connecting to the Customer Portal API.

(BZ#1802026)