Chapter 15. Troubleshooting

This chapter describes the most common troubleshooting scenarios related to Container-Native Storage.
  • What to do if a Container-Native Storage node fails

    If a Container-Native Storage node fails and you want to delete it, disable the node before deleting it. For more information, see Section 12.2.3, “Deleting Node”.
    If a Container-Native Storage node fails and you want to replace it, see Section 12.2.3.3, “Replacing a Node”.
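    As a rough sketch, a failed node is typically disabled in Heketi before it is deleted; the node ID below is a placeholder, and the referenced sections contain the complete procedure:
    # heketi-cli node disable <node_id>
    # heketi-cli node delete <node_id>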
  • What to do if a Container-Native Storage device fails

    If a Container-Native Storage device fails and you want to delete it, disable the device before deleting it. For more information, see Section 12.2.2, “Deleting Device”.
    If a Container-Native Storage device fails and you want to replace it, see Section 12.2.2.3, “Replacing a Device”.
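    Similarly, as a rough sketch, a failed device is disabled in Heketi before it is deleted; the device ID below is a placeholder, and the referenced sections contain the complete procedure:
    # heketi-cli device disable <device_id>
    # heketi-cli device delete <device_id>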
  • What to do if Container-Native Storage volumes require more capacity

    You can increase the storage capacity by adding devices, increasing the cluster size, or adding an entirely new cluster. For more information, see Section 12.1, “Increasing Storage Capacity”.
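    For example, a new device can be added to an existing node with the Heketi CLI; the device path and node ID below are placeholders, and the referenced section contains the complete procedure:
    # heketi-cli device add --name=/dev/<device> --node=<node_id>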
  • How to upgrade OpenShift when Container-Native Storage is installed

  • Viewing Log Files

    • Viewing Red Hat Gluster Storage Container Logs
      Debugging information related to Red Hat Gluster Storage containers is stored on the host where the containers are started. Specifically, the logs and configuration files can be found at the following locations on the OpenShift nodes where the Red Hat Gluster Storage server containers run:
      • /etc/glusterfs
      • /var/lib/glusterd
      • /var/log/glusterfs
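      For example, the main glusterd log can be inspected directly on the OpenShift node hosting the container (the exact log file names depend on your configuration):
      # tail -f /var/log/glusterfs/glusterd.log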
    • Viewing Heketi Logs
      Debugging information related to Heketi is stored locally in the container or in the persisted volume that is provided to the Heketi container.
      You can obtain logs for Heketi by running the docker logs <container-id> command on the OpenShift node where the container is running.
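      For example, assuming the container name contains heketi, it can be located and its logs viewed as follows:
      # docker ps | grep heketi
      # docker logs <container-id>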
  • Heketi command returns with no error or empty error like Error

    Sometimes, running a heketi-cli command returns no error or an empty error such as Error. This is usually because the Heketi server is not configured properly. First ping the Heketi server to validate that it is reachable, and then verify it with a curl command against the /hello endpoint.
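    For example, heketi-cli reads the server URL from the HEKETI_CLI_SERVER environment variable, so verify that it is set and that the server responds on the /hello endpoint:
    # echo $HEKETI_CLI_SERVER
    # curl $HEKETI_CLI_SERVER/hello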

  • Heketi reports an error while loading the topology file

    Running heketi-cli reports the Error "Unable to open topology file" error while loading the topology file. This can be caused by using the old syntax of a single hyphen (-) as the prefix for the json option. Use the new double-hyphen syntax and reload the topology file.
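    For example, load the topology with the double-hyphen form of the option; the file name below is a placeholder:
    # heketi-cli topology load --json=<topology_file>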

  • cURL command to heketi server fails or does not respond

    If the router or Heketi is not configured properly, error messages from Heketi may not be clear. To troubleshoot, ping the Heketi service using the endpoint and also using the IP address. If pinging the IP address succeeds but pinging the endpoint fails, it indicates a router configuration error.
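    For example, ping both the route hostname and the service IP address; the hostname and IP below are placeholders:
    # ping deploy-heketi-storage-project.cloudapps.mystorage.com
    # ping <heketi-service-IP>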

    After the router is set up properly, run a simple curl command like the following:
    # curl http://deploy-heketi-storage-project.cloudapps.mystorage.com/hello
    If Heketi is configured correctly, a welcome message from Heketi is displayed. If not, check the Heketi configuration.
  • Heketi fails to start when Red Hat Gluster Storage volume is used to store heketi.db file

    Sometimes Heketi fails to start when a Red Hat Gluster Storage volume is used to store heketi.db and reports the following error:

    [heketi] INFO 2016/06/23 08:33:47 Loaded kubernetes executor
    [heketi] ERROR 2016/06/23 08:33:47 /src/github.com/heketi/heketi/apps/glusterfs/app.go:149: write /var/lib/heketi/heketi.db: read-only file system
    ERROR: Unable to start application
    The read-only file system error shown above can occur when a Red Hat Gluster Storage volume is used as the backend and quorum is lost for that volume. In a replica 3 volume, this happens when two of the three bricks are down. You must ensure that quorum is met for the Heketi gluster volume so that heketi.db can be written to again.
    Even if you see a different error, it is recommended to check whether the Red Hat Gluster Storage volume serving the heketi.db file is available. Denied access to the heketi.db file is the most common reason for Heketi failing to start.
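    For example, the brick status of the volume that stores heketi.db can be checked from one of the Red Hat Gluster Storage pods; the volume name below is a placeholder:
    # gluster volume status <heketi_db_volume>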