Chapter 11. Troubleshooting

This chapter describes the most common troubleshooting scenarios for Red Hat OpenShift Container Storage.

What to do if a Red Hat OpenShift Container Storage node fails

If a Red Hat OpenShift Container Storage node fails and you want to delete it, disable the node before deleting it. For more information, see Section 1.2.3, “Deleting Node”.

If a Red Hat OpenShift Container Storage node fails and you want to replace it, see Section 1.2.3.3, “Replacing a Node”.

What to do if a Red Hat OpenShift Container Storage device fails

If a Red Hat OpenShift Container Storage device fails and you want to delete it, disable the device before deleting it. For more information, see Section 1.2.2, “Deleting Device”.

If a Red Hat OpenShift Container Storage device fails and you want to replace it, see Section 1.2.2.3, “Replacing a Device”.

What to do if Red Hat OpenShift Container Storage volumes require more capacity

You can increase the storage capacity by adding devices, increasing the cluster size, or adding an entirely new cluster. For more information, see Section 1.1, “Increasing Storage Capacity”.

How to upgrade OpenShift when Red Hat OpenShift Container Storage is installed

To upgrade OpenShift Container Platform, see https://access.redhat.com/documentation/en-us/openshift_container_platform/3.11/html/upgrading_clusters/install-config-upgrading-automated-upgrades#upgrading-to-ocp-3-10.

Viewing Log Files

  • Viewing Red Hat Gluster Storage Container Logs

    Debugging information related to Red Hat Gluster Storage containers is stored on the host where the containers are started. Specifically, the logs and configuration files can be found at the following locations on the OpenShift nodes where the Red Hat Gluster Storage server containers run:

    • /etc/glusterfs
    • /var/lib/glusterd
    • /var/log/glusterfs
  • Viewing Heketi Logs

    Debugging information related to Heketi is stored locally in the container or in the persistent volume that is provided to the Heketi container.

    You can obtain logs for Heketi by running the docker logs <container-id> command on the OpenShift node where the container is running.
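
As a small sketch of the Heketi log check described above, the following shell function narrows the output of docker logs to error lines. The function name heketi_errors and the filter on the string ERROR are illustrative assumptions; only the docker logs <container-id> command comes from this section.

```shell
# Print only the ERROR lines from a Heketi container's log.
# Usage: heketi_errors <container-id>
# heketi_errors is a hypothetical helper; run it on the OpenShift node
# that hosts the Heketi container.
heketi_errors() {
    # docker logs replays the container's stderr on stderr,
    # so merge both streams before filtering
    docker logs "$1" 2>&1 | grep 'ERROR'
}
```

Because the function ends with grep, it returns a non-zero status when the log contains no ERROR lines.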

Heketi command returns with no error or empty error

Sometimes, running a heketi-cli command returns no error message, or an empty error such as Error. This is mostly because the Heketi server is not configured properly. First ping the server to validate that it is available, and then verify it with a curl command against the /hello endpoint.

# curl http://deploy-heketi-storage-project.cloudapps.mystorage.com/hello
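
The curl check above can be wrapped so that a hung or failing server produces a clear result instead of an empty one. This is only a sketch: check_heketi is a hypothetical helper name and the 10-second timeout is an arbitrary choice, while the /hello endpoint and the host name come from this section.

```shell
# Check that a Heketi server answers on its /hello endpoint.
# Usage: check_heketi <base-url>
# check_heketi is a hypothetical helper, not a Heketi command.
check_heketi() {
    # --fail:     exit non-zero on HTTP error responses
    # --silent:   suppress the progress meter
    # --max-time: give up after 10 seconds (arbitrary value) instead of hanging
    if reply=$(curl --fail --silent --max-time 10 "$1/hello"); then
        echo "Heketi responded: $reply"
    else
        echo "No valid response from $1/hello" >&2
        return 1
    fi
}

# Example call, using the host name from the command above:
# check_heketi http://deploy-heketi-storage-project.cloudapps.mystorage.com
```
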

Heketi reports an error while loading the topology file

Running heketi-cli reports the error "Unable to open topology file" while loading the topology file. This could be due to the use of the old syntax, a single hyphen (-), as the prefix for the JSON option. Use the new syntax with double hyphens (--) and reload the topology file.
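
To illustrate the difference between the two syntaxes, here is a minimal sketch that reloads a topology file with the double-hyphen option. The wrapper name load_topology, the HEKETI_CLI override variable, and the file name topology.json are assumptions made for the example.

```shell
# Load a Heketi topology file using the double-hyphen option syntax.
# Usage: load_topology <topology-file>
# load_topology and the HEKETI_CLI override are hypothetical conveniences.
load_topology() {
    # Old single-hyphen form that triggers the error: -json=<file>
    # Double-hyphen form used here:                  --json=<file>
    "${HEKETI_CLI:-heketi-cli}" topology load --json="$1"
}

# Example call, assuming the file is named topology.json:
# load_topology topology.json
```
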

cURL command to Heketi server fails or does not respond

If the router or Heketi is not configured properly, error messages from Heketi may not be clear. To troubleshoot, ping the Heketi service using the endpoint and also using the IP address. If the ping by IP address succeeds and the ping by endpoint fails, it indicates a router configuration error.
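
The two ping checks can be combined into one comparison, sketched below. The helper name diagnose_route is hypothetical and both arguments are placeholders for your own endpoint host name and service IP address; the only conclusion taken from this section is that a reachable IP address with an unreachable endpoint points to the router.

```shell
# Compare reachability of the Heketi service by endpoint and by IP address.
# Usage: diagnose_route <endpoint-hostname> <service-ip>
# diagnose_route is a hypothetical helper; both arguments are placeholders.
diagnose_route() {
    # -c 1: send a single echo request per target
    if ping -c 1 "$2" >/dev/null 2>&1; then
        if ping -c 1 "$1" >/dev/null 2>&1; then
            echo "endpoint and IP address are both reachable"
        else
            echo "IP address reachable, endpoint not: check the router configuration"
        fi
    else
        echo "IP address not reachable"
        return 1
    fi
}
```
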

After the router is set up properly, run a simple curl command such as the following:

# curl http://deploy-heketi-storage-project.cloudapps.mystorage.com/hello

If Heketi is configured correctly, a welcome message from Heketi is displayed. If not, check the Heketi configuration.

Heketi fails to start when a Red Hat Gluster Storage volume is used to store the heketi.db file

Sometimes Heketi fails to start when a Red Hat Gluster Storage volume is used to store heketi.db, and reports the following error:

[heketi] INFO 2016/06/23 08:33:47 Loaded kubernetes executor
[heketi] ERROR 2016/06/23 08:33:47 /src/github.com/heketi/heketi/apps/glusterfs/app.go:149: write /var/lib/heketi/heketi.db: read-only file system
ERROR: Unable to start application

The read-only file system error shown above can occur when a Red Hat Gluster Storage volume is used as the backend and the volume has lost quorum. In a replica-3 volume, this happens if 2 of the 3 bricks are down. Ensure that quorum is met for the Heketi gluster volume so that Heketi can write to the heketi.db file again.
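
The replica-3 arithmetic can be made explicit with a short sketch. It assumes a simple majority rule (more than half of the bricks online), which matches the replica-3 case above but may not describe every quorum configuration. The helper name has_quorum is hypothetical, and the brick counts have to be read manually, for example from the Online column of gluster volume status <volume-name>.

```shell
# Report whether a replica set still has quorum under a simple majority rule.
# Usage: has_quorum <bricks-online> <bricks-total>
# has_quorum is a hypothetical helper; the majority rule is an assumption
# that matches the replica-3 case described above.
has_quorum() {
    # Quorum holds when more than half of the bricks are online
    if [ $(( $1 * 2 )) -gt "$2" ]; then
        echo "quorum met ($1 of $2 bricks online)"
    else
        echo "quorum lost ($1 of $2 bricks online)"
        return 1
    fi
}

# Example call for a replica-3 volume with 2 bricks down:
# has_quorum 1 3
```

With 1 of 3 bricks online the function reports that quorum is lost and returns a non-zero status; with 2 of 3 online it reports that quorum is met.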

Even if you see a different error, it is recommended practice to check whether the Red Hat Gluster Storage volume serving the heketi.db file is available. Denied access to the heketi.db file is the most common reason for Heketi failing to start.