Ceph: MDS load balancer ("mds_bal_interval") should be disabled on multi-MDS file systems

Solution Verified - Updated -

Issue

On file systems with multiple active MDS daemons, the MDS load balancer (controlled by mds_bal_interval) should be disabled.

With multiple active MDS daemons, leaving the MDS load balancer (LB) enabled can cause poor performance.
Messages like the following may appear in the MDS logs:

2023-04-19T05:13:03.660+0000 7f767dd28700 -1 mds.0.bal find_exports  balancer runs too long
2023-04-19T05:13:04.256+0000 7f767dd28700  1 mds.xhhy.xhhy-edon02.jbdssd Updating MDS map to version 637888 from mon.4
2023-04-19T05:13:08.298+0000 7f767dd28700  1 mds.xhhy.xhhy-edon02.jbdssd Updating MDS map to version 637889 from mon.4
2023-04-19T05:13:15.766+0000 7f767dd28700 -1 mds.0.bal find_exports  balancer runs too long
2023-04-19T05:13:15.766+0000 7f767dd28700 -1 mds.0.bal find_exports  balancer runs too long
2023-04-19T05:13:15.766+0000 7f767dd28700 -1 mds.0.bal find_exports  balancer runs too long
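
To gauge how often the balancer overruns, the messages above can be counted per MDS rank. The following is a minimal sketch, not a supported tool: the function name count_slow_balancer_runs is hypothetical, and the pattern is taken only from the sample lines above, so other releases may word the message differently.

```python
import re
from collections import Counter

# Pattern derived from the sample log lines in this article (assumption:
# the wording is the same in the release being examined).
BALANCER_SLOW = re.compile(r"mds\.(\d+)\.bal find_exports\s+balancer runs too long")


def count_slow_balancer_runs(lines):
    """Return a Counter mapping MDS rank to the number of
    'balancer runs too long' messages found in the given log lines."""
    hits = Counter()
    for line in lines:
        match = BALANCER_SLOW.search(line)
        if match:
            hits[int(match.group(1))] += 1
    return hits


# Sample lines copied from the log excerpt above.
sample = [
    "2023-04-19T05:13:03.660+0000 7f767dd28700 -1 mds.0.bal find_exports  balancer runs too long",
    "2023-04-19T05:13:04.256+0000 7f767dd28700  1 mds.xhhy.xhhy-edon02.jbdssd Updating MDS map to version 637888 from mon.4",
    "2023-04-19T05:13:15.766+0000 7f767dd28700 -1 mds.0.bal find_exports  balancer runs too long",
]

for rank, count in sorted(count_slow_balancer_runs(sample).items()):
    print(f"rank {rank}: {count} slow balancer run(s)")
```

The same function can be fed an open log file, since it only iterates over lines.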

Note that the MDS LB redistributes metadata across the file system ranks in response to load on the file system. It should not be confused with balancing incoming traffic, as is done by HAProxy or an F5 load balancer.

Example (7 active MDSs):

[root@edon02 ~]# ceph -s
  cluster:
    id:     948cdxxx-Redacted-Cluster-ID-yyycef5fc180
    health: HEALTH_WARN
            2 MDSs report slow requests

  services:
    mon: 5 daemons, quorum xhhy-edon01,xhhy-edon05,xhhy-edon02,xhhy-edon03,xhhy-edon04 (age 16m)
    mgr: xhhy-edon04.enoglz(active, since 24m), standbys: xhhy-edon02.uowgkl, xhhy-edon05.njrwql, xhhy-edon01.iujrjy
    mds: 7/7 daemons up, 3 standby
    osd: 584 osds: 584 up (since 2w), 584 in (since 8w)
    rgw: 10 daemons active (10 hosts, 1 zones)
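
To check whether a cluster falls into the multi-MDS case, the mds line of the status output above can be parsed. This is a rough sketch that assumes the plain-text layout shown in this example; parse_mds_summary and its dictionary keys are hypothetical names, and the text format may differ between releases.

```python
import re

# Matches a line such as "mds: 7/7 daemons up, 3 standby" as printed in the
# example above (assumption: the plain-text layout is unchanged).
MDS_LINE = re.compile(
    r"^\s*mds:\s+(\d+)/(\d+) daemons up(?:,\s+(\d+) standby)?",
    re.MULTILINE,
)


def parse_mds_summary(status_text):
    """Return the numbers from the 'mds:' line of `ceph -s` text output,
    or None if no such line is present."""
    match = MDS_LINE.search(status_text)
    if match is None:
        return None
    up, total, standby = match.groups()
    # "total" is simply the second number as printed; "standby" defaults to 0
    # when the line has no standby clause.
    return {"up": int(up), "total": int(total), "standby": int(standby or 0)}


status = """  services:
    mds: 7/7 daemons up, 3 standby
    osd: 584 osds: 584 up (since 2w), 584 in (since 8w)
"""

summary = parse_mds_summary(status)
if summary is not None and summary["up"] > 1:
    print(f"{summary['up']} MDS daemons up: the multi-MDS case applies")
```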

Environment

Red Hat Ceph Storage (RHCS) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
Red Hat Ceph Storage (RHCS) 7.x
