OpenShift IPI installer is completing with return code 0, even though some worker nodes are still missing

Solution In Progress

Environment

  • Red Hat OpenShift Container Platform (RHOCP) 4

Issue

  • openshift-install with installer-provisioned infrastructure (IPI) is completing with return code 0, even though the number of worker nodes reported by oc get nodes still does not match the value of the compute.replicas field in install-config.yaml
  • openshift-install with installer-provisioned infrastructure (IPI) is completing with return code 0, even though some worker Machine objects are still in the Provisioned phase
  • openshift-install with installer-provisioned infrastructure (IPI) on bare metal is completing with return code 0, even though some worker BareMetalHost objects are still in the inspecting or provisioning state

Resolution

  • The expected state of the cluster when the installation completes is only that the control plane is up and all Cluster Operators are healthy; that is, the cluster is ready for Day-2 operations
  • The cluster installation is therefore considered successful even if one or more worker nodes are unhealthy; assessing their status is considered a Day-2 operation
  • Note that there might be a time gap between the completion of openshift-install and the worker nodes becoming healthy. This gap depends on the time required to deploy the worker nodes on the specific infrastructure where the cluster is being installed. Before considering a worker node faulty, a wait of 40 minutes (60 minutes in a bare metal environment) after the conclusion of openshift-install should be considered normal
  • To perform a worker node assessment (see the combined sketch after this list):
    1. (only if the installation is on bare metal) run the command oc get bmh -n openshift-machine-api and check that all worker BareMetalHost objects are in the provisioned state
    2. run the command oc get machines -n openshift-machine-api and check that all worker Machine objects are in the Running phase
    3. run the command oc get nodes (Node objects are cluster-scoped, so no namespace is needed) and check that the number of worker Node objects in the Ready state matches the desired value
  • As of this writing, there is an open Bugzilla ticket against the OpenShift installation documentation to clarify the conditions under which openshift-install considers a new cluster installation successful, and to provide more instructions on how to check that all worker nodes of a new cluster are healthy
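  • A minimal shell sketch combining the checks above (the node-role.kubernetes.io/worker label is the standard worker label in RHOCP 4; adjust it if the cluster uses custom roles):

# 1. Bare metal only: every worker BareMetalHost should be in the "provisioned" state
oc get bmh -n openshift-machine-api

# 2. Every worker Machine should be in the "Running" phase
oc get machines -n openshift-machine-api

# 3. The number of Ready worker nodes should match compute.replicas from install-config.yaml
oc get nodes -l node-role.kubernetes.io/worker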

Root Cause

  • In order to assess the cluster health, openshift-install checks:
    • the status of the ClusterVersion API object, which summarizes the statuses of all the ClusterOperator objects
    • whether the OpenShift Console is accessible from the provisioning host
  • Cluster Operators such as the Machine API Operator and the Bare Metal Operator only become degraded when they cannot perform their own functions; Machine objects that are not in the Running phase, or BareMetalHost objects that are not in the provisioned state, do not constitute a failure for the Machine API Operator or the Bare Metal Operator
    • the operators were designed this way to preserve the general approach of Kubernetes controllers: they run a continuous reconciliation loop on the hosts, so a problem is not considered permanent; an administrator can fix it, and the operator will then retry and succeed
  • Therefore, no degraded status is propagated from the Machine API Operator or the Bare Metal Operator to the ClusterVersion object status, and hence to openshift-install
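  • To inspect the same signals that openshift-install evaluates, the following commands can be used (a sketch; both objects are cluster-scoped):

# Aggregated cluster status, as summarized from the Cluster Operators
oc get clusterversion

# Per-operator view; individual unhealthy Machines / BareMetalHosts
# do not surface here as DEGRADED=True
oc get clusteroperators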

Diagnostic Steps

  • Check that openshift-install is reporting a successful cluster installation:
...
time="2022-05-05T15:47:42Z" level=info msg="Install complete!"
time="2022-05-05T15:47:42Z" level=info msg="To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/example/auth/kubeconfig'"
time="2022-05-05T15:47:42Z" level=info msg="Access the OpenShift web-console here: https://console-openshift-console.apps.example.com
time="2022-05-05T15:47:42Z" level=info msg="Login to the console with user: \"kubeadmin\", and password: \"xxxxx-xxxxx-xxxxx-xxxxx\""
time="2022-05-05T15:47:42Z" level=debug msg="Time elapsed per stage:"
time="2022-05-05T15:47:42Z" level=debug msg="    Infrastructure: 31m47s"
time="2022-05-05T15:47:42Z" level=debug msg="Bootstrap Complete: 24m55s"
time="2022-05-05T15:47:42Z" level=debug msg=" Bootstrap Destroy: 13s"
time="2022-05-05T15:47:42Z" level=debug msg=" Cluster Operators: 51m23s"
time="2022-05-05T15:47:42Z" level=info msg="Time elapsed: 1h48m25s"
  • Check whether oc get machines -A shows any worker Machine object that is still in the Provisioned phase:
NAMESPACE               NAME                           PHASE         TYPE   REGION   ZONE   AGE
openshift-machine-api   example-zbbt6-master-0         Running                              95m
openshift-machine-api   example-zbbt6-master-1         Running                              95m
openshift-machine-api   example-zbbt6-master-2         Running                              95m
openshift-machine-api   example-zbbt6-worker-0-25bhp   Running                              49m
openshift-machine-api   example-zbbt6-worker-0-8b4c2   Running                              49m
openshift-machine-api   example-zbbt6-worker-0-jkbqt   Provisioned                          49m
openshift-machine-api   example-zbbt6-worker-0-qrl5b   Running                              49m
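  • To surface only the Machine objects that are not yet Running, a simple filter on the PHASE column can help (a sketch; the column position assumes the default output shown above, queried per namespace):

# Print Machine objects whose phase is not "Running"
oc get machines -n openshift-machine-api --no-headers | awk '$2 != "Running"'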
  • If the installation is on bare metal, check whether oc get bmh -A shows any worker BareMetalHost object that is still in the inspecting or provisioning state:
NAMESPACE               NAME              STATE         CONSUMER                      ONLINE 
openshift-machine-api   foobar-compute1   inspecting                                  true   
openshift-machine-api   foobar-compute2   provisioned   foobar-tlms8-worker-0-j9x86   true
openshift-machine-api   foobar-compute3   inspecting                                  true
openshift-machine-api   foobar-compute4   provisioned   foobar-tlms8-worker-0-k2qkk   true
openshift-machine-api   foobar-control1   provisioned   foobar-tlms8-master-0         true
openshift-machine-api   foobar-control2   provisioned   foobar-tlms8-master-1         true
openshift-machine-api   foobar-control3   provisioned   foobar-tlms8-master-2         true
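  • Similarly, BareMetalHost objects that are still inspecting or provisioning can be filtered on the STATE column (a sketch; with -n instead of -A, the state is the second column):

# Print BareMetalHost objects that are not yet "provisioned"
oc get bmh -n openshift-machine-api --no-headers | awk '$2 != "provisioned"'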
  • Verify that the corresponding worker nodes are missing from the output of oc get nodes (the first listing below matches the Machine example above; the second matches the bare metal example):
NAME                           STATUS   ROLES    AGE   VERSION
example-compute1.example.com   Ready    worker   13m   v1.21.6+bb8d50a
example-compute2.example.com   Ready    worker   13m   v1.21.6+bb8d50a
example-compute4.example.com   Ready    worker   14m   v1.21.6+bb8d50a
example-control1.example.com   Ready    master   52m   v1.21.6+bb8d50a
example-control2.example.com   Ready    master   55m   v1.21.6+bb8d50a
example-control3.example.com   Ready    master   55m   v1.21.6+bb8d50a
NAME                          STATUS   ROLES    AGE   VERSION
foobar-compute2.foobar.com    Ready    worker   14m   v1.21.6+bb8d50a
foobar-compute4.foobar.com    Ready    worker   16m   v1.21.6+bb8d50a
foobar-control1.foobar.com    Ready    master   57m   v1.21.6+bb8d50a
foobar-control2.foobar.com    Ready    master   58m   v1.21.6+bb8d50a
foobar-control3.foobar.com    Ready    master   58m   v1.21.6+bb8d50a
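  • To compare the number of Ready worker nodes against the desired replica count, a count such as the following can be used (a sketch; assumes the standard node-role.kubernetes.io/worker label):

# Count Ready worker nodes; compare with compute.replicas in install-config.yaml
oc get nodes -l node-role.kubernetes.io/worker --no-headers | awk '$2 == "Ready"' | wc -l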

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
