OpenShift Virtualization pods fail with UnexpectedAdmissionError during CRI-O restart

Solution Verified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)

    • 4.12
    • 4.13

Issue

  • Restarting CRI-O should not affect running pods
  • OpenShift Virtualization VirtualMachine pods fail during a CRI-O restart and are recreated
  • The pods fail with the error UnexpectedAdmissionError

Resolution

  • The fix was introduced in upstream Kubernetes
  • Upgrade to one of the following z-stream releases, or to a later version that includes the fix:
    • 4.12.39
    • 4.13.14
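
As a rough aid for checking a cluster against the releases listed above, the comparison can be sketched in Python. This is a minimal sketch, not an official tool: the function name includes_fix is hypothetical, it expects a plain X.Y.Z version string, and it assumes that minor releases newer than 4.13 already carry the fix, while older minors, which this article does not cover, are treated as not fixed.

```python
# Minimum fixed z-stream per minor release, as listed in the Resolution section.
FIXED_Z_STREAM = {(4, 12): 39, (4, 13): 14}


def includes_fix(version: str) -> bool:
    """Return True if a plain 'X.Y.Z' version is at or above the listed fixed release."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    if (major, minor) in FIXED_Z_STREAM:
        return patch >= FIXED_Z_STREAM[(major, minor)]
    # Assumption: minors newer than 4.13 already carry the fix; older minors
    # are not covered by this article and are treated as not fixed.
    return (major, minor) > max(FIXED_Z_STREAM)
```

For example, includes_fix("4.12.38") is False and includes_fix("4.13.14") is True.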

Root Cause

  • The issue is caused by a Kubernetes bug that affects workloads using device plugins
  • If the kubelet is restarted on a node (this also applies to a CRI-O restart), all existing running workloads that use devices are terminated with UnexpectedAdmissionError
  • KubeVirt runs virtual machines inside pods and uses a device plugin to advertise /dev/kvm on the nodes
  • For more details on the upstream fix for Kubernetes 1.25, refer to the related PR
  • Note that any pod that uses a device plugin will fail, not only KubeVirt pods
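
To see which pods on a cluster were hit, one option is to filter the pod list for the failure reason described above. The following is a minimal sketch under stated assumptions: the helper name pods_with_admission_error is hypothetical, the input is assumed to be a PodList already parsed from JSON, and the error is assumed to be recorded in each pod's status.reason field.

```python
REASON = "UnexpectedAdmissionError"


def pods_with_admission_error(pod_list: dict) -> list:
    """Return 'namespace/name' for every pod whose status.reason matches REASON."""
    hits = []
    for pod in pod_list.get("items", []):
        # Assumption: the kubelet records the admission failure in status.reason.
        if pod.get("status", {}).get("reason") == REASON:
            meta = pod.get("metadata", {})
            hits.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return hits


# Illustrative data only; the pod names below are made up.
sample = {
    "items": [
        {"metadata": {"namespace": "vms", "name": "virt-launcher-a"},
         "status": {"phase": "Failed", "reason": REASON}},
        {"metadata": {"namespace": "vms", "name": "virt-launcher-b"},
         "status": {"phase": "Running"}},
    ]
}
print(pods_with_admission_error(sample))
```

On a live cluster the input could come from the JSON output of oc get pods --all-namespaces -o json, loaded with json.load.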

