glusterd will not start, with error in glusterd.log: resolve brick failed in restore

Solution Verified

Issue

The glusterd service will not start on a node in the cluster, and no gluster command will run on that node. The /var/log/glusterfs/glusterd.log file contains a message like this:

[2021-07-09 07:18:27.111679] E [MSGID: 106187] [glusterd-store.c:4962:glusterd_resolve_all_bricks] 0-glusterd: Failed to resolve brick /mnt/data1/1 with host gluster1.example.com of volume sample_volname in restore
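
To confirm that a node is hitting this failure, the log can be searched for the message. The following is a minimal sketch rather than part of the original solution: the function name find_resolve_failures is hypothetical, and the pattern assumes the message wording shown above.

```shell
# Print the brick, host, and volume named in each
# "Failed to resolve brick ... in restore" line of a glusterd log.
# The log path defaults to the one quoted in this article; pass another
# file as the first argument to check a copy of the log.
find_resolve_failures() {
    log="${1:-/var/log/glusterfs/glusterd.log}"
    sed -n 's/.*Failed to resolve brick \([^ ]*\) with host \([^ ]*\) of volume \([^ ]*\) in restore.*/brick=\1 host=\2 volume=\3/p' "$log"
}

# Example (run on the affected node):
#   find_resolve_failures
```

No output means the log holds no message of this form, in which case the cause of the start failure is probably a different one.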

The problem likely lies in the /var/lib/glusterd/peers directory on the node where glusterd will not start. On each node, this directory contains one file for every other node in the cluster. For example, in a healthy three-node cluster, the contents of the peers directories should look something like this:

[root@gluster2 peers]# ssh root@gluster1 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
c43fe588-e9d5-4fc8-93f8-2c817f903825

[root@gluster2 peers]# ls -1 /var/lib/glusterd/peers
acb80b35-d6ac-4085-87cd-ba69ff3f81e6
c43fe588-e9d5-4fc8-93f8-2c817f903825

[root@gluster2 peers]# ssh root@gluster3 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
acb80b35-d6ac-4085-87cd-ba69ff3f81e6

Each node has peer files for the other two nodes and none for itself. Each file is named after the UUID of the node it represents and contains that node's UUID, a state value, and its hostname:

[root@gluster2 ~]# cat /var/lib/glusterd/peers/acb80b35-d6ac-4085-87cd-ba69ff3f81e6 
uuid=acb80b35-d6ac-4085-87cd-ba69ff3f81e6
state=3
hostname1=gluster1.example.com
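
To see at a glance which hostname each peer file stands for, the uuid and hostname1 keys can be read from every file in the directory. This is a sketch that assumes the key names shown above; the function name list_peers is hypothetical.

```shell
# Print "uuid hostname" for each peer file in a peers directory.
# The directory defaults to /var/lib/glusterd/peers; pass another
# path as the first argument to inspect a copy.
list_peers() {
    dir="${1:-/var/lib/glusterd/peers}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip if the directory is empty
        uuid=$(sed -n 's/^uuid=//p' "$f")
        host=$(sed -n 's/^hostname1=//p' "$f")
        printf '%s %s\n' "$uuid" "$host"
    done
}

# Example:
#   list_peers
```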

When glusterd will not start on a node and /var/log/glusterfs/glusterd.log contains a "Failed to resolve brick ... in restore" error message, the /var/lib/glusterd/peers directory on that node is likely missing the peer file for one of the other nodes. In the following listing, gluster2 has only one peer file where it should have two:

[root@gluster2 peers]# ssh root@gluster1 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
c43fe588-e9d5-4fc8-93f8-2c817f903825

[root@gluster2 peers]# ls -1 /var/lib/glusterd/peers
acb80b35-d6ac-4085-87cd-ba69ff3f81e6

[root@gluster2 peers]# ssh root@gluster3 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
acb80b35-d6ac-4085-87cd-ba69ff3f81e6

Comparing the filenames shows that the file missing on gluster2 is c43fe588-e9d5-4fc8-93f8-2c817f903825. Because gluster1 has that file and gluster3 does not (a node holds no peer file for itself), it is the peer file for gluster3.
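
The same comparison can be scripted. The sketch below is illustrative only: the function name missing_peers is hypothetical, and it assumes the affected node's own UUID is available (on typical installations it is stored in the UUID= line of /var/lib/glusterd/glusterd.info, which this article does not cover). It takes the peer listings from the other nodes, removes the local node's own UUID and the UUIDs it already has, and prints what is left.

```shell
# Usage: missing_peers LOCAL_UUID LOCAL_LIST OTHER_LIST...
# LOCAL_LIST and each OTHER_LIST are files holding the output of
# `ls -1 /var/lib/glusterd/peers` from one node.
# Prints every UUID known to the other nodes that is neither the local
# node's own UUID nor already present in the local peers directory.
missing_peers() {
    local_uuid="$1"
    local_list="$2"
    shift 2
    sort -u "$@" |
        grep -v -x -F -e "$local_uuid" |
        grep -v -x -F -f "$local_list"
}

# Example, run on gluster2 (hostnames as in this article):
#   ls -1 /var/lib/glusterd/peers > local.txt
#   ssh root@gluster1 ls -1 /var/lib/glusterd/peers > g1.txt
#   ssh root@gluster3 ls -1 /var/lib/glusterd/peers > g3.txt
#   uuid=$(sed -n 's/^UUID=//p' /var/lib/glusterd/glusterd.info)  # assumed location
#   missing_peers "$uuid" local.txt g1.txt g3.txt
```

With the listings shown above, this would report c43fe588-e9d5-4fc8-93f8-2c817f903825 as the only missing peer.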

Environment

  • GlusterFS 3.X
