Mitigation for CVE-2026-31431 ("Copy Fail") in Azure Red Hat OpenShift
Environment
- Azure Red Hat OpenShift
- 4.x
Issue
- Azure Red Hat OpenShift clusters are affected by CVE-2026-31431, which has been classified as an Important vulnerability.
Resolution
- Red Hat has released OpenShift updates that include patched kernels. The following versions contain the fix:
| OpenShift release | Fixed version | Errata |
|---|---|---|
| 4.21 | 4.21.14 | RHSA-2026:13811 |
| 4.20 | 4.20.21 | RHSA-2026:13862 |
| 4.19 | 4.19.30 | RHSA-2026:13690 |
| 4.18 | 4.18.40 | RHSA-2026:13727 |
| 4.16 | 4.16.61 | RHSA-2026:13729 |
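As a rough aid for checking a cluster against the table above, the following shell sketch compares a version string with the fixed build for its release stream. The helper names fixed_version_for and is_fixed are hypothetical (they are not part of oc), and the comparison assumes a plain x.y.z version string; pre-release or nightly suffixes are not handled.

```shell
# Fixed z-stream per release stream, copied from the table above.
fixed_version_for() {
  case "$1" in
    4.21) echo 4.21.14 ;;
    4.20) echo 4.20.21 ;;
    4.19) echo 4.19.30 ;;
    4.18) echo 4.18.40 ;;
    4.16) echo 4.16.61 ;;
    *) return 1 ;;   # stream not listed in the table
  esac
}

# is_fixed VERSION
# Returns 0 if VERSION is at or above the fixed build for its stream,
# 1 if it is below, and 2 if the stream is not listed in the table.
is_fixed() {
  local version="$1" minor patch fixed
  minor="${version%.*}"     # 4.20.21 -> 4.20
  patch="${version##*.}"    # 4.20.21 -> 21
  fixed="$(fixed_version_for "$minor")" || return 2
  [ "$patch" -ge "${fixed##*.}" ]
}

# Possible use against a live cluster (requires oc and a logged-in session):
#   is_fixed "$(oc get clusterversion version -o jsonpath='{.status.desired.version}')"
```

The jsonpath in the comment reads the desired version, which may still be rolling out. A return code of 2 only means the stream is absent from the table, not that the stream is unaffected.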
NOTE:
- Upgrade your cluster to an OpenShift release that includes the fix (see the table above).
- For clusters that cannot be upgraded immediately, a mitigation is available, as detailed below.
Do not apply this mitigation to clusters whose workloads rely on AF_ALG socket access to kernel AEAD algorithms. This is uncommon, but it can occur in hardware crypto acceleration use cases. There are currently no other known side effects of this mitigation strategy.
For managed clusters:
- Control plane nodes: ARO SREs have already applied the mitigation.
- Worker and infra nodes must be updated to apply the mitigation. There are two options:
  - Add a MachineConfig that sets a kernel command line argument to disable the algif_aead module. Once this configuration is applied, the worker nodes reboot sequentially. This procedure is documented later in this article.
  - Implement a zero-reboot remediation by using a BPF LSM DaemonSet, which blocks all AF_ALG AEAD binds. Instructions for this procedure are available in the following Knowledge base article.
Mitigation Steps with Machine Config
Attention: This will cause a rolling reboot of all nodes targeted by the MachineConfig!
Note: PodDisruptionBudget specifications for workloads running on worker nodes may block node drains and therefore delay the reboots.
Until the cluster is upgraded, one way to mitigate the vulnerability is to block initialization of the algif_aead module by adding a kernel argument through a MachineConfig.
Prerequisites
Confirm that you have the required permissions to create a MachineConfig resource.
oc auth whoami -ojsonpath="{.status.userInfo.groups}"
["system:cluster-admins","system:authenticated"]
The output must include the system:cluster-admins group.
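For scripted use, the group check above could be wrapped in a small helper. This is only a sketch: has_cluster_admins is a hypothetical function name, and it does a plain text match on the JSON array printed by the command above rather than parsing the JSON.

```shell
# has_cluster_admins GROUPS_JSON
# Succeeds if the JSON array of group names contains
# "system:cluster-admins" as an exact, quoted entry.
has_cluster_admins() {
  printf '%s' "$1" | grep -q '"system:cluster-admins"'
}

# Example using the sample output shown above (no cluster access needed).
if has_cluster_admins '["system:cluster-admins","system:authenticated"]'; then
  echo "ok: user is in system:cluster-admins"
else
  echo "missing system:cluster-admins group"
fi
```

Against a live cluster, the argument would be the output of the oc auth whoami command shown above.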
Procedure
- Apply the MachineConfig:
oc apply -f - <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-disable-algif-aead
spec:
  kernelArguments:
    - initcall_blacklist=algif_aead_init
EOF
This adds initcall_blacklist=algif_aead_init to the kernel boot command line.
- Watch the MachineConfigPool until it finishes rolling out. All workers will reboot. Wait for UPDATED=True and DEGRADED=False.
oc wait mcp/worker --for=condition=Updated=True --timeout=30m
Alternatively, you can also watch progress interactively:
oc get mcp worker -w
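The oc wait command above checks only the Updated condition. A minimal polling sketch that checks both Updated=True and Degraded=False could look like the following. The function names pool_ready, mcp_condition and wait_for_pool are hypothetical, and the defaults (60 attempts, 30 seconds apart) are assumptions chosen to match the 30 minute timeout used above.

```shell
# pool_ready UPDATED DEGRADED
# Succeeds only when the pool reports Updated=True and Degraded=False.
pool_ready() {
  [ "$1" = "True" ] && [ "$2" = "False" ]
}

# mcp_condition POOL TYPE
# Prints the status ("True"/"False") of one MachineConfigPool condition.
mcp_condition() {
  oc get mcp "$1" -o jsonpath="{.status.conditions[?(@.type==\"$2\")].status}"
}

# wait_for_pool POOL [ATTEMPTS] [SLEEP_SECONDS]
# Polls the pool until it is ready; returns 1 if it never becomes ready.
wait_for_pool() {
  local pool="$1" attempts="${2:-60}" pause="${3:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    if pool_ready "$(mcp_condition "$pool" Updated)" \
                  "$(mcp_condition "$pool" Degraded)"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$pause"
  done
  return 1
}

# Possible use (requires oc and a logged-in session):
#   wait_for_pool worker && echo "worker pool updated and not degraded"
```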
- Select a worker node to verify.
WORKER_NODE=$(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[0].metadata.name}')
- Check that the kernel argument is present on the boot command line.
oc debug node/"$WORKER_NODE" -- chroot /host cat /proc/cmdline | grep initcall_blacklist
Example output:
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-58db33956f03d7d30d356cce4dab4efc705c658bd38b1f3fdf9d31150e69a479/vmlinuz-5.14.0-570.60.1.el9_6.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/58db33956f03d7d30d356cce4dab4efc705c658bd38b1f3fdf9d31150e69a479/0 ignition.platform.id=aws systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=0 initcall_blacklist=algif_aead_init
- Check dmesg for confirmation that the initcall was blacklisted.
oc debug node/"$WORKER_NODE" -- chroot /host dmesg | grep -i "blacklisted"
Example output:
[ 2.084817] initcall algif_aead_init blacklisted
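The steps above verify a single worker. To repeat the boot command line check on every worker node, something along these lines could be used. This is a sketch, not a supported tool: blacklists_initcall and check_workers are hypothetical helper names, and a node whose debug session fails is simply reported as not mitigated.

```shell
# blacklists_initcall CMDLINE FUNC
# Succeeds if CMDLINE carries an initcall_blacklist= argument whose
# comma-separated list names FUNC exactly.
blacklists_initcall() {
  local func="$2" arg list
  # $1 is unquoted on purpose so the command line splits on whitespace
  # (assumes it contains no shell glob characters).
  for arg in $1; do
    case "$arg" in
      initcall_blacklist=*)
        list="${arg#initcall_blacklist=}"
        # Wrap in commas so only whole list entries match.
        case ",$list," in
          *",$func,"*) return 0 ;;
        esac
        ;;
    esac
  done
  return 1
}

# check_workers
# Prints one line per worker node and returns non-zero if any node
# does not have the mitigation on its boot command line.
check_workers() {
  local node cmdline failed=0
  for node in $(oc get nodes -l node-role.kubernetes.io/worker \
                  -o jsonpath='{.items[*].metadata.name}'); do
    cmdline="$(oc debug node/"$node" -- chroot /host cat /proc/cmdline 2>/dev/null)"
    if blacklists_initcall "$cmdline" algif_aead_init; then
      echo "$node: mitigated"
    else
      echo "$node: NOT mitigated"
      failed=1
    fi
  done
  return "$failed"
}

# Possible use (requires oc and a logged-in session):
#   check_workers || echo "at least one worker is missing the kernel argument"
```

Matching whole entries of the comma-separated list avoids false positives from similarly named functions, which a plain grep for the substring would not rule out.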
Root Cause
A flaw was found in the Linux kernel's algif_aead cryptographic algorithm interface. An incorrect 'in-place operation' was introduced, where the source and destination data mappings were different. This could lead to unexpected behavior or data integrity issues during cryptographic operations, potentially impacting the reliability of encrypted communications.