Mitigation for CVE-2026-31431 ("Copy Fail") in Azure Red Hat OpenShift

Solution Verified

Environment

  • Azure Red Hat OpenShift
    • 4.x

Issue

  • Azure Red Hat OpenShift clusters are confirmed to be affected by CVE-2026-31431, which has been classified as an Important severity vulnerability.

Resolution

  • Red Hat has released OpenShift updates that include patched kernels. The following versions contain the fix:

    OpenShift release   Fixed version   Errata
    4.21                4.21.14         RHSA-2026:13811
    4.20                4.20.21         RHSA-2026:13862
    4.19                4.19.30         RHSA-2026:13690
    4.18                4.18.40         RHSA-2026:13727
    4.16                4.16.61         RHSA-2026:13729

NOTE:
- You should upgrade your cluster to an OpenShift release that includes the fix (see version list above).
- For clusters that cannot be upgraded immediately, a mitigation is available as detailed below.
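
To decide whether a cluster still needs the upgrade or the mitigation, its current version can be compared against the fixed versions listed above. The following is a minimal sketch, assuming bash and GNU sort (for `sort -V`); the function names `fixed_version_for` and `is_patched` are hypothetical helpers, not part of `oc`. Minor releases that are not in the list above are reported as unknown rather than guessed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the fixed z-stream for the minor release of $1.
# The values are copied from the version list in this article.
fixed_version_for() {
  case "$1" in
    4.21.*) echo 4.21.14 ;;
    4.20.*) echo 4.20.21 ;;
    4.19.*) echo 4.19.30 ;;
    4.18.*) echo 4.18.40 ;;
    4.16.*) echo 4.16.61 ;;
    *) return 1 ;;   # minor release not listed in this article
  esac
}

# Hypothetical helper: returns 0 if $1 is at or above the fixed version,
# 1 if it is older, and 2 if the minor release is not listed.
is_patched() {
  local current="$1" fixed
  fixed=$(fixed_version_for "$current") || return 2
  # sort -V orders version strings; the fixed version must sort first (or be equal).
  [ "$(printf '%s\n%s\n' "$fixed" "$current" | sort -V | head -n1)" = "$fixed" ]
}

# Usage against a live cluster (requires a logged-in oc session):
#   is_patched "$(oc get clusterversion version -o jsonpath='{.status.desired.version}')" \
#     && echo "fix included" || echo "upgrade or mitigate"
```

The comparison only looks at plain `x.y.z` version strings, so pre-release or nightly builds would need separate handling.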

This mitigation should not be applied to clusters where workloads rely on AF_ALG socket access to kernel AEAD algorithms. This is uncommon, but it can occur in hardware crypto acceleration use cases. There are currently no other known side effects of this mitigation strategy.

For managed clusters:

  • Control plane nodes: ARO SREs have already applied the mitigation.
  • Worker and Infra nodes: These nodes need to be updated to apply the mitigation. There are two options:
    • Modify the MachineConfig to add a kernel command line argument that disables the algif_aead module. Once this configuration is applied, the worker nodes reboot sequentially. This procedure is documented later in this article. Or,
    • Implement a zero-reboot remediation by using a BPF LSM DaemonSet, which blocks all AF_ALG AEAD binds. Instructions for this procedure are available at the following Knowledge base article.

Mitigation Steps with Machine Config

Attention: This will cause a rolling reboot of all nodes targeted by the MachineConfig!

Note: PodDisruptionBudgets that protect workloads running on worker nodes may block node drains and therefore delay or stall the reboots.
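
Before applying the MachineConfig, it can help to list the PodDisruptionBudgets that currently allow zero disruptions, since those are the most likely to hold up a drain. The following is a minimal sketch, assuming a logged-in `oc` session for the live query; `blocking_pdbs` is a hypothetical helper name, and the tab-separated input format is an assumption made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "namespace<TAB>name<TAB>disruptionsAllowed" lines
# on stdin and prints namespace/name for PDBs that allow zero disruptions.
blocking_pdbs() {
  awk -F'\t' '$3 == "0" { print $1 "/" $2 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get pdb --all-namespaces \
#     -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.disruptionsAllowed}{"\n"}{end}' \
#     | blocking_pdbs
```

A PDB that appears in this list is not necessarily a problem, but it is worth reviewing with the workload owner before the rollout starts.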

One way to mitigate the vulnerability until the cluster can be upgraded is to block the algif_aead module from initializing. This is done by adding a kernel argument through a MachineConfig.

Prerequisites

Confirm that you have the required permissions to create a MachineConfig resource.

oc auth whoami -ojsonpath="{.status.userInfo.groups}"
["system:cluster-admins","system:authenticated"]

The resulting output must include the system:cluster-admins group.
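
The group check can also be scripted so that a runbook stops early when the permission is missing. The following is a minimal sketch, assuming the `oc auth whoami` output has the JSON list format shown above; `has_cluster_admins` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the JSON group list on stdin and succeeds only
# if the system:cluster-admins group is present.
has_cluster_admins() {
  grep -q '"system:cluster-admins"'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc auth whoami -o jsonpath='{.status.userInfo.groups}' | has_cluster_admins \
#     || { echo "cluster-admin access is required" >&2; exit 1; }
#
# A more direct check of the specific permission is also possible:
#   oc auth can-i create machineconfigs.machineconfiguration.openshift.io
```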

Procedure

  1. Apply the MachineConfig
oc apply -f - <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-disable-algif-aead
spec:
  kernelArguments:
    - initcall_blacklist=algif_aead_init
EOF

This adds initcall_blacklist=algif_aead_init to the kernel boot command line of the nodes in the worker pool.

  2. Watch the MachineConfigPool until it finishes rolling out. All workers will reboot. Wait for UPDATED=True and DEGRADED=False.
oc wait mcp/worker --for=condition=Updated=True --timeout=30m

Alternatively, watch the progress interactively:

oc get mcp worker -w
  3. Select a worker node to verify.
WORKER_NODE=$(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[0].metadata.name}')
  4. Check that the kernel argument is present on the boot command line.
oc debug node/"$WORKER_NODE" -- chroot /host cat /proc/cmdline | grep initcall_blacklist

Example output:

BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-58db33956f03d7d30d356cce4dab4efc705c658bd38b1f3fdf9d31150e69a479/vmlinuz-5.14.0-570.60.1.el9_6.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/58db33956f03d7d30d356cce4dab4efc705c658bd38b1f3fdf9d31150e69a479/0 ignition.platform.id=aws systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=0 initcall_blacklist=algif_aead_init
  5. Check dmesg for confirmation that the initcall was blacklisted.
oc debug node/"$WORKER_NODE" -- chroot /host dmesg | grep -i "blacklisted"

Example output:

[    2.084817] initcall algif_aead_init blacklisted
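
The verification steps above check a single worker node. To cover every worker, the same command-line check can be run in a loop. The following is a minimal sketch, assuming bash and a logged-in `oc` session; `cmdline_has_mitigation` and `verify_all_workers` are hypothetical helper names. It only matches the exact argument used in this article, not a comma-separated initcall_blacklist list.

```shell
#!/usr/bin/env bash
# Kernel argument added by the MachineConfig in this article.
KARG='initcall_blacklist=algif_aead_init'

# Hypothetical helper: succeeds if the kernel command line on stdin contains
# the mitigation argument as a whole space-separated word.
cmdline_has_mitigation() {
  tr ' ' '\n' | grep -qx -- "$KARG"
}

# Hypothetical helper: prints one PASS/FAIL line per worker node.
# Requires a logged-in oc session; each check starts a short-lived debug pod.
verify_all_workers() {
  local node
  for node in $(oc get nodes -l node-role.kubernetes.io/worker \
      -o jsonpath='{.items[*].metadata.name}'); do
    if oc debug "node/$node" -- chroot /host cat /proc/cmdline 2>/dev/null \
        | cmdline_has_mitigation; then
      echo "PASS $node"
    else
      echo "FAIL $node"
    fi
  done
}

# Usage: verify_all_workers
```

A FAIL line can mean either that the argument is missing or that the debug pod could not be started, so failed nodes are worth rechecking by hand with the single-node commands above.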

Root Cause

A flaw was found in the Linux kernel's algif_aead cryptographic algorithm interface. An incorrect 'in-place operation' was introduced, where the source and destination data mappings were different. This could lead to unexpected behavior or data integrity issues during cryptographic operations, potentially impacting the reliability of encrypted communications.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
