3.3. Troubleshooting common issues with installing on Red Hat Virtualization (RHV)

Here are some common issues you might encounter, along with proposed causes and solutions.

3.3.1. CPU load increases and nodes go into a Not Ready state

  • Symptom: CPU load increases significantly and nodes start going into a Not Ready state.
  • Cause: The storage domain latency might be too high, especially for master nodes.
  • Solution:

    Make the nodes ready again by restarting the kubelet service. Enter:

    $ systemctl restart kubelet

    Inspect the OpenShift Container Platform metrics service, which automatically gathers and reports data such as the etcd disk sync duration. If the cluster is operational, use this data to determine whether storage latency or throughput is the root cause. If it is, consider using a storage resource that has lower latency and higher throughput.

    To get raw metrics, enter the following command as kubeadmin or as a user with cluster-admin privileges:

    $ oc get --insecure-skip-tls-verify --server=https://localhost:<port> --raw=/metrics

    To learn more, see Exploring Application Endpoints for the purposes of Debugging with OpenShift 4.x.
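
The raw metrics output is long, so it can help to reduce it to a single latency figure. The following is a minimal sketch, not a supported tool. It assumes the endpoint you query exposes the etcd histogram etcd_disk_wal_fsync_duration_seconds, which depends on which component the port is forwarded to. The helper name mean_wal_fsync_seconds is hypothetical. If several series are present, it uses only the last one it reads.

```shell
# Hypothetical helper: read Prometheus-format text on stdin and print the
# mean etcd WAL fsync duration in seconds (histogram _sum divided by _count).
# Assumption: the metric name below is present in the output you collected.
mean_wal_fsync_seconds() {
  awk '
    # The sample value is the last whitespace-separated field on the line.
    /^etcd_disk_wal_fsync_duration_seconds_sum/   { sum = $NF }
    /^etcd_disk_wal_fsync_duration_seconds_count/ { count = $NF }
    END {
      if (count > 0) printf "%.6f\n", sum / count
      else print "no samples"
    }
  '
}

# Usage against a live cluster (replace <port> as in the command above):
# oc get --insecure-skip-tls-verify --server=https://localhost:<port> --raw=/metrics | mean_wal_fsync_seconds
```

A mean that stays high over repeated samples is one indication that storage latency deserves a closer look. It is not a definitive diagnosis.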

3.3.2. Trouble connecting to the OpenShift Container Platform cluster API

  • Symptom: The installation program completes but the OpenShift Container Platform cluster API is not available. The bootstrap virtual machine remains up after the bootstrap process is complete. When you enter the following command, the response times out.

    $ oc login -u kubeadmin -p *** <apiurl>
  • Cause: The bootstrap VM was not deleted by the installation program and has not released the cluster’s API IP address.
  • Solution: Use the wait-for subcommand to be notified when the bootstrap process is complete:

    $ ./openshift-install wait-for bootstrap-complete

    When the bootstrap process is complete, delete the bootstrap virtual machine:

    $ ./openshift-install destroy bootstrap
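
The two steps above can be chained so that the bootstrap virtual machine is destroyed only if the wait succeeds. This is a sketch under stated assumptions: the function name cleanup_bootstrap and the INSTALLER variable are hypothetical, and the installer is assumed to be in the current directory and to return a nonzero exit status when the wait fails.

```shell
# Assumed variable: path to the installation program, overridable by the caller.
INSTALLER="${INSTALLER:-./openshift-install}"

# Hypothetical helper: wait for bootstrap to complete, then destroy the
# bootstrap virtual machine. If the wait fails, leave the VM untouched.
cleanup_bootstrap() {
  if "$INSTALLER" wait-for bootstrap-complete; then
    "$INSTALLER" destroy bootstrap
  else
    echo "bootstrap did not complete; leaving the bootstrap VM in place" >&2
    return 1
  fi
}

# Usage:
# cleanup_bootstrap
```

Keeping the destroy step behind the success check avoids removing the bootstrap virtual machine while the control plane still depends on it.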