Cluster not accessible; ovnkube-node pod on a node is not running

Solution Verified

Issue

  • The ovnkube-node pod on one node is not running, and retrieving its logs fails with a certificate verification error:

    # oc logs ovnkube-node-2sbtx -n openshift-ovn-kubernetes -c ovnkube-controller
    Error from server: Get "https://10.58.212.40:10250/containerLogs/openshift-ovn-kubernetes/ovnkube-node-2sbtx/ovnkube-controller": tls: failed to verify certificate: x509: certificate signed by unknown authority
    
  • The OpenShift cluster is not accessible through the web console or the CLI.

  • Several cluster operators are degraded or progressing:

    Operator: 'authentication'
    Issue          : Degraded
    Reason         : APIServerDeployment_UnavailablePod::OAuthServerDeployment_UnavailablePod
    Message        : APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for
                     apiserver.openshift-oauth-apiserver (2 containers are waiting in pending
                     apiserver-7c9cb77b9d-rn7ts pod)
                     OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for
                     oauth-openshift.openshift-authentication (container is waiting in pending
                     oauth-openshift-64ff8c5c65-mt5pc pod)
    LastTransition : 2025-02-05T00:38:05Z
    Issue          : Progressing
    Reason         : OAuthServerDeployment_PodsUpdating
    Message        : OAuthServerDeploymentProgressing:
                     deployment/oauth-openshift.openshift-authentication: 1/3 pods have been updated
                     to the latest generation
    LastTransition : 2025-02-27T11:01:48Z
    Operator: 'dns'
    Issue          : Progressing
    Reason         : DNSReportsProgressingIsTrue
    Message        : DNS "default" reports Progressing=True: "Have 7 available DNS pods, want 8."
    LastTransition : 2025-02-27T11:29:52Z
    Operator: 'kube-apiserver'
    Issue          : Degraded
    Reason         : InstallerPodContainerWaiting_ContainerCreating::InstallerPodNetworking_FailedCreatePodSandBox::NodeInstaller_InstallerPodFailed
    Message        : InstallerPodContainerWaitingDegraded: Pod "installer-86-retry-1866-nodeName" on
                     node "nodeName" container "installer" is waiting since 2025-02-27 12:23:34 +0000
                     UTC because ContainerCreating
                     InstallerPodNetworkingDegraded: Pod "installer-86-retry-1866-nodeName" on node
                     "nodeName" observed degraded networking: (combined from similar events): Failed
                     to create pod sandbox: rpc error: code = Unknown desc = failed to create pod
                     network sandbox
                     k8s_installer-86-retry-1866-nodeName_openshift-kube-apiserver_a48ba0a8-ad70-4ba7-97a5-f9a52d897bc5_0(edfc35169299c2645fc3f2ad9eb4aead328719ca4eec4b4d4b9fda5e54148857):
                     error adding pod openshift-kube-apiserver_installer-86-retry-1866-nodeName to
                     CNI network "multus-cni-network": plugin type="multus-shim"
                     name="multus-cni-network" failed (add): CmdAdd (shim): failed to send CNI
                     request: Post "http://dummy/cni": dial unix /run/multus/socket/multus.sock:
                     connect: no such file or directory
                     NodeInstallerDegraded: 1 nodes are failing on revision 86:
                     NodeInstallerDegraded: installer: esources",
                     NodeInstallerDegraded:  PodManifestDir: (string) (len=25) "/etc/kubernetes/manifests",
                     NodeInstallerDegraded:  Timeout: (time.Duration) 2m0s,
                     NodeInstallerDegraded:  StaticPodManifestsLockFile: (string) "",
                     NodeInstallerDegraded:  PodMutationFns: ([]installerpod.PodMutationFunc) <nil>,
                     NodeInstallerDegraded:  KubeletVersion: (string) ""
                     NodeInstallerDegraded: })
                     NodeInstallerDegraded: I0225 07:56:12.318253       1 cmd.go:410] Getting
                     controller reference for node nodeName
                     NodeInstallerDegraded: W0225 07:57:08.666583       1 cmd.go:420] unable to get
                     owner reference (falling back to namespace): Get
                     "https://172.29.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods/installer-86-retry-1865-nodeName?timeout=14s":
                     net/http: request canceled while waiting for connection (Client.Timeout
                     exceeded while awaiting headers)
                     NodeInstallerDegraded: I0225 07:57:08.666676       1 cmd.go:423] Waiting for
                     installer revisions to settle for node nodeName
                     NodeInstallerDegraded: W0225 07:57:22.667869       1 cmd.go:467] Error getting
                     installer pods on current node nodeName: Get
                     "https://172.29.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods?labelSelector=app%3Dinstaller":
                     net/http: request canceled while waiting for connection (Client.Timeout
                     exceeded while awaiting headers)
                     NodeInstallerDegraded: W0225 07:57:46.672390       1 cmd.go:467] Error getting
                     installer pods on current node nodeName: Get
                     "https://172.29.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods?labelSelector=app%3Dinstaller":
                     net/http: request canceled while waiting for connection (Client.Timeout
                     exceeded while awaiting headers)
                     NodeInstallerDegraded: W0225 07:58:06.671409       1 cmd.go:467] Error getting
                     installer pods on current node nodeName: Get
                     "https://172.29.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods?labelSelector=app%3Dinstaller":
                     net/http: request canceled while waiting for connection (Client.Timeout
                     exceeded while awaiting headers)
                     NodeInstallerDegraded: F0225 07:58:12.319792       1 cmd.go:106] timed out
                     waiting for the condition
                     NodeInstallerDegraded:
    LastTransition : 2025-02-06T20:55:20Z
    Issue          : Progressing
    Reason         : NodeInstaller
    Message        : NodeInstallerProgressing: 3 nodes are at revision 80; 0 nodes have achieved
                     new revision 86
    LastTransition : 2025-02-06T20:50:09Z
    Operator: 'network'
    Issue          : Degraded
    Reason         : RolloutHung
    Message        : DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" rollout is not making
                     progress - last change 2025-02-27T13:44:17Z
    LastTransition : 2025-02-27T13:55:37Z
    Issue          : Progressing
    Reason         : Deploying
    Message        : DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" is not available (awaiting
                     1 nodes)
                     DaemonSet "/openshift-multus/multus" is not available (awaiting 1 nodes)
                     DaemonSet "/openshift-multus/network-metrics-daemon" is not available
                     (awaiting 1 nodes)
                     DaemonSet "/openshift-network-diagnostics/network-check-target" is not
                     available (awaiting 1 nodes)
    LastTransition : 2025-02-25T07:59:33Z
    Operator: 'openshift-apiserver'
    Issue          : Degraded
    Reason         : APIServerDeployment_UnavailablePod
    Message        : APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for
                     apiserver.openshift-apiserver (3 containers are waiting in pending
                     apiserver-7b978984db-rjrff pod)
    LastTransition : 2025-02-26T15:27:37Z
    
    Issue          : Progressing
    Reason         : APIServerDeployment_PodsUpdating
    Message        : APIServerDeploymentProgressing: deployment/apiserver.openshift-apiserver: 1/3
                     pods have been updated to the latest generation
    LastTransition : 2025-02-26T14:42:10Z
    
  • The br-ex interface is missing on the affected node.
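
The operator and br-ex symptoms above can be checked from a shell. The following is a minimal sketch rather than part of the original solution: `degraded_operators` and `has_interface` are hypothetical helper names, and the first helper assumes the default `oc get clusteroperators` column order (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE, MESSAGE).

```shell
# Hypothetical helpers for illustration; the names are not from the article.

# Print the names of operators whose DEGRADED column is "True".
# Reads `oc get clusteroperators` table output on stdin and assumes the
# default columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE.
degraded_operators() {
  awk 'NR > 1 && $5 == "True" { print $1 }'
}

# Exit 0 if the interface named in $1 appears in `ip -o link show` output
# on stdin, and exit 1 otherwise.
has_interface() {
  awk -v want="$1" -F': ' '
    # Field 2 is the interface name; drop any "@peer" suffix before comparing.
    { split($2, name, "@"); if (name[1] == want) found = 1 }
    END { exit !found }
  '
}

# Usage (shown as comments so that sourcing this file runs nothing):
#   oc get clusteroperators | degraded_operators
#   ip -o link show | has_interface br-ex || echo "br-ex is missing"
```

Both helpers read from stdin, so they can also be run against saved command output when the API cannot be reached. The `ip` command has to be run on the affected node itself.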

Environment

  • Red Hat OpenShift Container Platform
    • v4.x
  • OVNKubernetes
