GFS2 cluster very low read performance
Hello,
At one of my customers we are having problems backing up several GFS2 data shares. The setup is a two-node RHEL 6.7 active-active cluster with 7 GFS2 mountpoints, which currently look like this (df -h):
/dev/mapper/1 384G 291G 94G 76% /data1
/dev/mapper/2 384G 280G 105G 73% /data2
/dev/mapper/3 384G 265G 120G 69% /data3
/dev/mapper/4 384G 298G 87G 78% /data4
/dev/mapper/5 768G 549G 220G 72% /data5
/dev/mapper/6 1.2T 1.2T 26G 98% /data6
/dev/mapper/7 100G 92G 8.9G 92% /data7
They are mounted like this:
/dev/mapper/1 on /data1 type gfs2 (rw,noatime,nodiratime,hostdata=jid=1)
/dev/mapper/2 on /data2 type gfs2 (rw,noatime,nodiratime,hostdata=jid=1)
/dev/mapper/3 on /data3 type gfs2 (rw,noatime,nodiratime,hostdata=jid=1)
/dev/mapper/4 on /data4 type gfs2 (rw,noatime,nodiratime,hostdata=jid=1)
/dev/mapper/5 on /data5 type gfs2 (rw,noatime,nodiratime,hostdata=jid=1)
/dev/mapper/6 on /data6 type gfs2 (rw,noatime,nodiratime,hostdata=jid=1)
/dev/mapper/7 on /data7 type gfs2 (rw,noatime,nodiratime,hostdata=jid=1)
The backup system is Networker, and backups are taken from both nodes (at different times). Unfortunately, lately the backups keep failing: files are copied at very low speeds or the transfer gets stuck completely at 0 bytes. We've tried remounting the GFS2 filesystems to see if that helps, but it's the same, and we've also tried dropping the caches (echo -n 3 >/proc/sys/vm/drop_caches), which didn't help either.
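For reference, the remount was a plain unmount/mount cycle per filesystem plus the cache drop, roughly as follows (/data1 is just an example, mount options as in our fstab):

# example for /data1; repeated for the other mountpoints
umount /data1
mount -t gfs2 -o noatime,nodiratime /dev/mapper/1 /data1
# flush dirty pages first, then drop page cache, dentries and inodes
sync
echo -n 3 >/proc/sys/vm/drop_caches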
For some reason iotop showed an improvement right after remounting, but after a few seconds almost all disk reads were coming from the "glock_workqueue" kernel threads:
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
5014 be/4 root 209.29 K/s 0.00 B/s 0.00 % 31.60 % [glock_workqueue]
5012 be/4 root 205.48 K/s 0.00 B/s 0.00 % 29.61 % [glock_workqueue]
5013 be/4 root 144.60 K/s 0.00 B/s 0.00 % 25.92 % [glock_workqueue]
5011 be/4 root 49.47 K/s 0.00 B/s 0.00 % 7.28 % [glock_workqueue]
32677 be/4 root 7.61 K/s 0.00 B/s 0.00 % 3.86 % save -a (...) /data5 /data5
9603 be/4 root 7.61 K/s 0.00 B/s 0.00 % 1.21 % save -a (...) /data6 /data6
1474 be/3 root 0.00 B/s 22.83 K/s 0.00 % 0.40 % [jbd2/dm-7-8]
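If it helps with the diagnosis, I believe the GFS2 glock state can be dumped from debugfs while the backup is slow, roughly like this (the lock table name "mycluster:data5" is only a placeholder for our real clustername:fsname):

# make sure debugfs is mounted
mount -t debugfs none /sys/kernel/debug
# dump the glocks of one filesystem; entries with queued waiters point to contended locks
cat /sys/kernel/debug/gfs2/mycluster:data5/glocks > /tmp/data5-glocks.txt

I haven't analysed such a dump in depth yet.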
I've also checked whether writes are degraded, but writing a 1 GB file with dd on one of the mountpoints went fine at about 25 MB/s.
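The write test was roughly the following (block size and count from memory, the file name is just an example):

dd if=/dev/zero of=/data1/dd-test.bin bs=1M count=1024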
Is it possible to solve this issue, or is the only solution to implement cluster-aware backup software?