Exec probes fail clusterwide after upgrade to cri-o-1.19.2-4 in Red Hat OpenShift Container Platform 4.x

Solution In Progress - Updated 2024-06-13T22:51:34+00:00

Issue

After upgrading the cluster, readiness and liveness probes for containers on the RHEL worker nodes fail frequently, and seemingly at random, with timeouts. The failures occur cluster-wide.

Even seemingly innocuous probes such as this one:

    name: service
    readinessProbe:
      exec:
        command:
        - cat
        - /etc/hosts
      failureThreshold: 3
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1

time out with messages such as:

    Jun 30 20:17:59 host hyperkube[1887]: I0630 20:17:59.805952    1887 prober.go:117] Liveness probe for "service(uuid):service" failed (failure): command timed out
    Jun 30 20:17:59 host hyperkube[1887]: I0630 20:17:59.806074    1887 event.go:291] "Event occurred" object="namespace/service" kind="Pod" apiVersion="v1" type="Warning" reason="Unhealthy" message="Liveness probe failed: command timed out"
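
To gauge how widespread the timeouts are on a node, it can help to tally the prober.go messages per container. The following is a minimal sketch, not part of the original solution: the regular expression is inferred from the single log line quoted above and may need adjusting for other releases, the "Readiness" and "Startup" probe kinds are assumed to be logged in the same format, and the names `PROBE_FAILURE`, `count_probe_timeouts`, and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the sample prober.go line in this article.
# Assumption: Readiness and Startup probes are logged in the same format.
PROBE_FAILURE = re.compile(
    r'(?P<kind>Liveness|Readiness|Startup) probe for "(?P<target>[^"]+)" '
    r'failed \(failure\): (?P<reason>.*)$'
)


def count_probe_timeouts(lines):
    """Count 'command timed out' probe failures per (probe kind, target)."""
    counts = Counter()
    for line in lines:
        match = PROBE_FAILURE.search(line)
        if match and match.group("reason").strip() == "command timed out":
            counts[(match.group("kind"), match.group("target"))] += 1
    return counts


# The log line quoted in this article, used as sample input.
SAMPLE = [
    'Jun 30 20:17:59 host hyperkube[1887]: I0630 20:17:59.805952    1887 '
    'prober.go:117] Liveness probe for "service(uuid):service" failed '
    '(failure): command timed out',
]

if __name__ == "__main__":
    # Print one row per probe target: count, probe kind, target name.
    for (kind, target), n in count_probe_timeouts(SAMPLE).most_common():
        print(f"{n}\t{kind}\t{target}")
```

In practice the input would come from the node's journal rather than `SAMPLE`, for example by passing `sys.stdin` to `count_probe_timeouts` and piping the journal output into the script. The exact journal unit to query is an assumption that depends on the node setup.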

Environment

Red Hat OpenShift Container Platform 4.x
cri-o-1.19.2-4, cri-o-1.19.2-6
