Machines are stuck in the provisioning state due to pending CSR


Environment

  • Azure Red Hat OpenShift 4 (ARO)

Issue

  • New nodes fail to join the cluster.
  • Machines are stuck in the provisioning state.

Resolution

  • Approve all pending CSRs using the following command.
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
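
A newly provisioned node typically requests two certificates in sequence: a client CSR from the node-bootstrapper service account, then a kubelet serving CSR once the client certificate has been issued. A single approval pass can therefore leave the second CSR pending. The following is a minimal sketch, not a supported procedure, that repeats the approval in a few passes. The function names `approve_pending_csrs` and `approve_csrs_in_passes`, the default of 3 passes, and the 30-second delay are illustrative assumptions; the `oc` invocations are the same ones used in the command above.

```shell
# Approve every CSR that has no status yet (that is, every pending CSR).
approve_pending_csrs() {
    oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' |
    while read -r name; do
        if [ -n "$name" ]; then
            oc adm certificate approve "$name"
        fi
    done
}

# Repeat the approval so that CSRs created after the first pass
# (for example the kubelet serving CSR) are also picked up.
# $1 = number of passes (assumed default: 3)
# $2 = seconds to wait between passes (assumed default: 30)
approve_csrs_in_passes() {
    passes=${1:-3}
    delay=${2:-30}
    i=1
    while [ "$i" -le "$passes" ]; do
        approve_pending_csrs
        if [ "$i" -lt "$passes" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
}
```

For example, `approve_csrs_in_passes 3 30` runs three passes 30 seconds apart. Review the pending requests with `oc get csr` first, because this approves every pending CSR without inspecting its requestor.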

Root Cause

  • Kubelets on instances provisioned by the machine-api automatically attempt to join the cluster by issuing a CSR (certificate signing request).
  • These CSRs should be approved automatically. On rare occasions, a CSR remains stuck in the pending state, and the kubelet cannot join the cluster.

Diagnostic Steps

  • Check the status of machines.
$ oc get machines -n openshift-machine-api
eexxxxx-cswxx-worker-useast1-jqsrf   Provisioned   Standard_D8s_v3   useast   1      7d21h
  • Check pending CSRs.
$ oc get csr
NAME        AGE     REQUESTOR                                                                   CONDITION
csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
  • Check machines stuck in the provisioning state.
$ oc describe machine eexxxxx-cswxx-worker-useast1-jqsrf -n openshift-machine-api
      Message:               machine successfully created
      Reason:                MachineCreationSucceeded
      Status:                True
      Type:                  MachineCreated
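
To make the pending requests easier to review, the `oc get csr` table can be reduced to the name and requestor of each pending CSR. This is a small sketch that assumes the table layout shown above (a header row containing `REQUESTOR`, with `CONDITION` as the last column); `pending_csrs` is a hypothetical helper name.

```shell
# Read the table printed by `oc get csr` on stdin and print
# "<name> <requestor>" for every CSR whose condition is Pending.
pending_csrs() {
    awk '
        # Header row: remember which column holds the requestor.
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "REQUESTOR") col = i
            next
        }
        # Data rows: keep only CSRs whose last column is exactly "Pending".
        col && $NF == "Pending" { print $1, $col }
    '
}
```

Run it as `oc get csr | pending_csrs`. A requestor ending in `node-bootstrapper` indicates a client certificate request from a node that has not yet joined the cluster.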

