glusterd will not start, with error in glusterd.log: resolve brick failed in restore
Issue
The glusterd service will not start on a node in the cluster, and no gluster command will run on that node. The /var/log/glusterfs/glusterd.log log contains a message like this:
[2021-07-09 07:18:27.111679] E [MSGID: 106187] [glusterd-store.c:4962:glusterd_resolve_all_bricks] 0-glusterd: Failed to resolve brick /mnt/data1/1 with host gluster1.example.com of volume sample_volname in restore
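To confirm that a node is hitting this error, the log can be searched for the message. The following is a minimal sketch, not part of any Gluster tooling: the function name show_resolve_failures is hypothetical, and it assumes the message wording matches the sample line above.

```shell
#!/bin/sh
# show_resolve_failures [LOGFILE]
# Hypothetical helper: prints the brick, host and volume named in each
# "Failed to resolve brick ... in restore" line of a glusterd log.
# Assumes the message format shown in the sample log line above.
show_resolve_failures() {
    logfile="${1:-/var/log/glusterfs/glusterd.log}"
    # -n plus the trailing p prints only the lines where the substitution matched.
    sed -n 's/.*Failed to resolve brick \(.*\) with host \(.*\) of volume \(.*\) in restore.*/brick=\1 host=\2 volume=\3/p' "$logfile"
}
```

Running show_resolve_failures with no argument reads the default log path. No output means no matching message was found in that file.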
The problem likely lies in the /var/lib/glusterd/peers directory on the node where glusterd will not start. This directory contains one file for every other node in the cluster. For example, in a healthy three-node cluster, the contents of the peers directory on each node look something like this:
[root@gluster2 peers]# ssh root@gluster1 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
c43fe588-e9d5-4fc8-93f8-2c817f903825
[root@gluster2 peers]# ls -1 /var/lib/glusterd/peers
acb80b35-d6ac-4085-87cd-ba69ff3f81e6
c43fe588-e9d5-4fc8-93f8-2c817f903825
[root@gluster2 peers]# ssh root@gluster3 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
acb80b35-d6ac-4085-87cd-ba69ff3f81e6
Each node has peer files for the other two nodes. Each file is named after the UUID of the node it represents and contains that node's UUID, peer state, and hostname:
[root@gluster2 ~]# cat /var/lib/glusterd/peers/acb80b35-d6ac-4085-87cd-ba69ff3f81e6
uuid=acb80b35-d6ac-4085-87cd-ba69ff3f81e6
state=3
hostname1=gluster1.example.com
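To map every peer file on a node to a hostname at a glance, the uuid and hostname1 keys can be read from each file. This is a sketch only: list_peers is a hypothetical helper name, and it assumes every peer file uses the key=value layout shown above.

```shell
#!/bin/sh
# list_peers [PEERS_DIR]
# Hypothetical helper: prints "<uuid> <hostname>" for each peer file.
# Assumes the key=value layout shown in the example peer file.
list_peers() {
    peers_dir="${1:-/var/lib/glusterd/peers}"
    for f in "$peers_dir"/*; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -f "$f" ] || continue
        uuid=$(sed -n 's/^uuid=//p' "$f")
        host=$(sed -n 's/^hostname1=//p' "$f")
        printf '%s %s\n' "$uuid" "$host"
    done
}
```

Called without arguments on a node, list_peers reads /var/lib/glusterd/peers and prints one line per peer file.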
When glusterd will not start on a node and /var/log/glusterfs/glusterd.log contains a "resolve brick failed in restore" error message, the /var/lib/glusterd/peers directory on that node is likely missing the peer file for one of the other nodes:
[root@gluster2 peers]# ssh root@gluster1 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
c43fe588-e9d5-4fc8-93f8-2c817f903825
[root@gluster2 peers]# ls -1 /var/lib/glusterd/peers
acb80b35-d6ac-4085-87cd-ba69ff3f81e6
[root@gluster2 peers]# ssh root@gluster3 ls -1 /var/lib/glusterd/peers
3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5
acb80b35-d6ac-4085-87cd-ba69ff3f81e6
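Comparing the filenames across nodes shows that the file missing on node 2 is c43fe588-e9d5-4fc8-93f8-2c817f903825. That UUID appears in the peers directory on gluster1 but not on gluster3, and a node has no peer file for itself, so it is the peer file for node 3.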
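The same comparison can be scripted. The sketch below is illustrative rather than an official procedure: missing_peers is a hypothetical helper that takes saved `ls -1` listings (one UUID per line) and the affected node's own UUID, and prints every UUID the other nodes know about that the affected node lacks. In the example above, gluster2's own UUID is inferred to be 3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5, because it is the one UUID listed on both gluster1 and gluster3 but never on gluster2 itself.

```shell
#!/bin/sh
# missing_peers LOCAL_LIST OWN_UUID REMOTE_LIST...
# Hypothetical helper. Each *_LIST is a file holding the output of
# "ls -1 /var/lib/glusterd/peers" from one node, one UUID per line.
# Prints UUIDs that appear in some remote listing but are neither the
# affected node's own UUID nor already present in its local listing.
missing_peers() {
    local_list="$1"
    own_uuid="$2"
    shift 2
    # Union of all remote listings, minus our own UUID, minus what we have.
    sort -u "$@" |
        grep -v -x -F -e "$own_uuid" |
        grep -v -x -F -f "$local_list"
}

# Example, run on the affected node (file names are illustrative):
#   ls -1 /var/lib/glusterd/peers > local.txt
#   ssh root@gluster1 ls -1 /var/lib/glusterd/peers > g1.txt
#   ssh root@gluster3 ls -1 /var/lib/glusterd/peers > g3.txt
#   missing_peers local.txt 3d1a24c2-4097-45a5-8ae1-8f0451bc6ed5 g1.txt g3.txt
```

With the listings from the example above, this prints c43fe588-e9d5-4fc8-93f8-2c817f903825. The function exits with a non-zero status when nothing is missing, because the final grep selects no lines.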
Environment
- GlusterFS 3.X