Red Hat Gluster Storage : Self-heal daemon memory consumption increases when toggling the "cluster.self-heal-daemon" volume set option



Environment

  • Red Hat Gluster Storage 3.X


Issue

  • Memory consumption of the self-heal daemon increases when toggling the "cluster.self-heal-daemon" volume set option.
  • Why is the self-heal daemon consuming so much memory?
  • Self-heal daemon memory consumption is very high.


Resolution

  • In general, disabling the self-heal daemon is not recommended, and toggling it frequently is strongly discouraged.
  • If the recommendation above is followed, there is very little chance of being affected by the identified leak.
  • If you do hit the issue, apply the following workaround:

    • Find the PID of the self-heal daemon (glustershd):
    # ps aux | grep glustershd
    • Kill the glustershd process:
    # kill -9 <glustershd_pid>
    • Restart the volume with the "force" option:
    # gluster volume start <volname> force

    Note: The workaround above does not affect ongoing I/O or management traffic, so it is safe to apply. It is also a generic workaround and can be used in any situation where high memory consumption by the glustershd process is observed.
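The steps above can be combined into a short script. This is a minimal sketch; the volume name "myvol" is a placeholder, so substitute your own volume name, and the script is a no-op on nodes where glustershd is not running:

```shell
#!/bin/sh
# Sketch of the workaround above. "myvol" is a placeholder volume name;
# substitute the name of your own volume.
VOLNAME=myvol

pid=$(pgrep -x glustershd)                   # PID of the self-heal daemon, if running
if [ -n "$pid" ]; then
    kill -9 "$pid"                           # kill the leaking glustershd process
    gluster volume start "$VOLNAME" force    # force-starting the volume respawns glustershd
    echo "glustershd (pid $pid) killed; volume $VOLNAME force-started"
else
    echo "glustershd is not running on this node"
fi
```

Killing glustershd is safe for client traffic because the daemon only performs background healing; the force-start respawns it with a fresh process image, releasing the leaked memory.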

Root Cause

  • This is a known issue: there is a memory leak in the graph switch path.
  • The issue is tracked in Bug 1529501, and work to fix the leak is in progress.
  • For the latest status of the bug, follow up via Bugzilla or contact Red Hat Support.

Diagnostic Steps

  • Run the volume set command for "cluster.self-heal-daemon" in a loop and observe the memory consumption.

For example :

 # for i in {1..300}; do gluster volume set VOLNAME$i cluster.self-heal-daemon off; sleep 3; done

The 300 volumes should be created before running the above loop.
  • The self-heal daemon occupies almost 4.6 GB of resident memory after all the volume set operations.


# ps aux|grep glus
root      8078 12.4  2.6 28807468 1315220 ?    Ssl  05:13   0:28 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/ -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/e32d8903c5b60efed5cc4e725235c143.socket --xlator-option *replicate*.node-uuid=cedc8e7d-d3a0-47f2-a50e-ebe12fe964bc


# ps aux|grep glustershd
root      8078  3.0  9.4 31756588 4677648 ?    Ssl  05:13   3:56 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/ -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/e32d8903c5b60efed5cc4e725235c143.socket --xlator-option *replicate*.node-uuid=cedc8e7d-d3a0-47f2-a50e-ebe12fe964bc
  • The resident memory keeps increasing with each toggle of the self-heal daemon option.
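To observe the growth, the resident set size (RSS) can be sampled periodically with ps. The sketch below samples the current shell's own PID purely for demonstration; replace pid with the glustershd PID found via "ps aux | grep glustershd":

```shell
#!/bin/sh
# Minimal sketch: sample a process's resident set size (RSS, in KiB) over time.
# For demonstration this samples the current shell itself; replace "pid" with
# the glustershd PID on your system.
pid=$$
for i in 1 2 3; do
    rss=$(ps -o rss= -p "$pid" | tr -d ' ')   # RSS in KiB, header suppressed
    echo "sample $i: pid $pid RSS ${rss} KiB"
    sleep 1
done
```

If the reported RSS climbs steadily across samples while the option is being toggled, the process is exhibiting the leak described above.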

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.