Why Are Some Gluster Files Being Constantly Healed?


Issue

  • Running the command gluster v heal test-vol info shows a stuck entry that needs to be healed (see the polling sketch after the output below):

    gluster volume heal test-vol info
    
    Brick node1:/brick1/brick
    file1
    Status: Connected
    Number of entries: 1
    
    Brick node2:/brick2/brick
    Status: Connected
    Number of entries: 0
    
    Brick node3:/brick3/arbiter
    file1
    Status: Connected
    Number of entries: 1
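
    One quick way to confirm that this is a genuinely stuck entry, rather than different files briefly passing through the heal queue, is to poll heal info a few times and check whether the same path keeps reappearing. A minimal sketch, reusing the volume name test-vol and the file name file1 from the output above:

    for i in 1 2 3; do
        date
        gluster volume heal test-vol info | grep -E 'Brick|Number of entries|file1'
        sleep 60
    done

    If the entry count never drops to zero and the same file is listed on every pass, the entry is stuck.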
    
  • When the extended attributes of this entry are collected, two of the nodes blame the third one as the copy that needs to be healed:

    [root@node1 ~]# getfattr -d -m . -e hex /brick1/brick/file1
    
    getfattr: Removing leading '/' from absolute path names
    # file: /brick1/brick/file1
    trusted.afr.dirty=0x000000000000000000000000
    trusted.afr.test-vol-client-1=0x0000001a0000000000000000
    trusted.gfid=0xd2dc2973dfff404a8e7b23e4d6e7b83b
    trusted.glusterfs.shard.block-size=0x0000000000400000
    trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000
    
    [root@node2 ~]# getfattr -d -m . -e hex /brick2/brick/file1
    
    getfattr: Removing leading '/' from absolute path names
    # file: /brick2/brick/file1
    trusted.afr.dirty=0x000000000000000000000000
    trusted.afr.test-vol-client-0=0x000000000000000000000000
    trusted.afr.test-vol-client-2=0x000000000000000000000000
    trusted.gfid=0xd2dc2973dfff404a8e7b23e4d6e7b83b
    trusted.glusterfs.shard.block-size=0x0000000000400000
    trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000
    
    [root@node3 ~]# getfattr -d -m . -e hex /brick3/arbiter/file1
    
    getfattr: Removing leading '/' from absolute path names
    # file: /brick3/arbiter/file1
    trusted.afr.dirty=0x000000000000000000000000
    trusted.afr.test-vol-client-1=0x0000001d0000000000000000
    trusted.gfid=0xd2dc2973dfff404a8e7b23e4d6e7b83b
    trusted.glusterfs.shard.block-size=0x0000000000400000
    trusted.glusterfs.shard.file-size=0x0000000000100000000000000000000000000000000008000000000000000000
    

    From the above, nodes 1 and 3 (clients 0 and 2 for the AFR xlator) agree that node 2 (client 1) is the one that needs to be healed: both of them carry a non-zero trusted.afr.test-vol-client-1 attribute, while node 2 blames neither of its peers. The self-heal daemon log at /var/log/glusterfs/glustershd.log confirms this; the file is being healed from those two nodes in a loop (a sketch for decoding these pending counters follows the log excerpt):

    [2020-06-29 12:11:25.034839] I [MSGID: 108026] [afr-self-heal-common.c:1212:afr_log_selfheal] 0-test-vol-replicate-0: Completed data selfheal on d2dc2973-dfff-404a-8e7b-23e4d6e7b83b. sources=[0] 2  sinks=1
    [2020-06-29 12:11:27.819586] I [MSGID: 108026] [afr-self-heal-common.c:1212:afr_log_selfheal] 0-test-vol-replicate-0: Completed data selfheal on d2dc2973-dfff-404a-8e7b-23e4d6e7b83b. sources=[0] 2  sinks=1
    [2020-06-29 12:41:32.100811] I [MSGID: 108026] [afr-self-heal-common.c:1212:afr_log_selfheal] 0-test-vol-replicate-0: Completed data selfheal on d2dc2973-dfff-404a-8e7b-23e4d6e7b83b. sources=[0] 2  sinks=1
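
    The pending counters in the trusted.afr attributes shown earlier can be read directly: each trusted.afr.<volume>-client-<N> value is 12 bytes, made of three big-endian 32-bit counters for pending data, metadata and entry operations, in that order. A minimal sketch that decodes the value node1 reports for test-vol-client-1 (the hex string is copied from the getfattr output above):

    # Split the 12-byte changelog value into its three 32-bit counters:
    # pending data, metadata and entry operations.
    val=0000001a0000000000000000
    printf 'data=%d metadata=%d entry=%d\n' "0x${val:0:8}" "0x${val:8:8}" "0x${val:16:8}"
    # Prints: data=26 metadata=0 entry=0

    Node 3 reports 0x0000001d0000000000000000 for the same client, i.e. 29 pending data operations, consistent with the sources=[0] 2  sinks=1 lines above.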
    
  • Why is this occurring, and how can it be fixed?

Environment

Red Hat Gluster Storage version 3.x
