Why are Brick Processes Not Starting After an OCS 3.x Node Reboot?

Issue

  • After a restart of an OpenShift Container Storage node, the bricks in the Gluster pod running on that node do not start. The output of the gluster volume status command shows the bricks hosted on the rebooted node as N/A. Taking the volume heketidbstorage as an example, this is the status observed after rebooting OCS node 10.0.0.1:

    sh-4.2# gluster volume status heketidbstorage
    Status of volume: heketidbstorage
    Gluster process                             TCP Port  RDMA Port  Online  Pid
    ------------------------------------------------------------------------------
    Brick 10.0.0.1:/var/lib/heketi/mounts/vg_XXXXXXXX/brick_XXXXXXXXX/brick       N/A       N/A        N       N/A
    Brick 10.0.0.2:/var/lib/heketi/mounts/vg_XXXXXXXX/brick_XXXXXXXXX/brick       49152     0          Y       251
    Brick 10.0.0.3:/var/lib/heketi/mounts/vg_XXXXXXXX/brick_XXXXXXXXX/brick       49152     0          Y       212
    

    The same thing is happening for the rest of the volumes.

  • The output of a ps command in the pod confirms that the matching glusterfsd brick processes are not running.

  • As a workaround, restarting glusterd inside the affected Gluster pod brings the bricks back online: systemctl restart glusterd. See the sketch after this list.
  • How can OCS nodes be rebooted without any manual intervention afterwards, so that the bricks come back online automatically?
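
A minimal sketch of the workaround, assuming the Gluster pods run in the glusterfs project; the project name and the pod name glusterfs-storage-abc12 are placeholders, so adjust both to your environment:

    # Find the Gluster pod running on the rebooted node (10.0.0.1 in this example)
    oc get pods -n glusterfs -o wide | grep 10.0.0.1

    # Open a shell in that pod (the pod name here is a placeholder)
    oc rsh -n glusterfs glusterfs-storage-abc12

    # Confirm that the glusterfsd brick processes are not running
    sh-4.2# ps aux | grep '[g]lusterfsd'

    # Restart glusterd so that it respawns the missing brick processes
    sh-4.2# systemctl restart glusterd

    # Verify that the bricks are back online (Online column shows Y)
    sh-4.2# gluster volume status heketidbstorage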

Environment

  • OpenShift Container Storage 3.x
