Galera Cluster node failed, won't rejoin the cluster

Solution In Progress - Updated -

Issue

  • We have a three-node Galera cluster using rh-mariadb101. One node failed and will not rejoin the cluster. The error logs give no indication of why: they show only the service starting, beginning replication from another node, and then shutting down. We expect the node to complete replication and rejoin the cluster.

  • We suspect file system corruption because df -Th shows only 1G in use on the failed node, whereas the other two nodes show over 50G in use. However, xfs_repair does not appear to have changed anything, and the directories all look correct.
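
The disk usage comparison described above can be repeated consistently on each node with a small script. The following is only a minimal sketch in Python, not part of the original report: the function names are hypothetical, and the data directory path is an assumption based on the usual rh-mariadb101 layout, so adjust it to match the datadir setting on your nodes. It prints the used space of the file system holding the directory (the figure df reports as Used) next to the summed size of the files under it.

```python
import os
import shutil


def filesystem_used_bytes(path):
    """Bytes in use on the file system that holds `path` (df's 'Used' figure)."""
    return shutil.disk_usage(path).used


def directory_size_bytes(path):
    """Sum of the sizes of regular files under `path`, similar in spirit to du."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so that link targets are not counted.
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # A file may vanish while a state transfer is running.
                continue
    return total


def report(path):
    """One-line summary that can be compared across nodes."""
    gib = 1024 ** 3
    return "{}: filesystem used {:.1f} GiB, files under path {:.1f} GiB".format(
        path,
        filesystem_used_bytes(path) / gib,
        directory_size_bytes(path) / gib,
    )


if __name__ == "__main__":
    # ASSUMPTION: typical data directory for the rh-mariadb101 collection.
    print(report("/var/opt/rh/rh-mariadb101/lib/mysql"))
```

Running the script as root on all three nodes gives one line per node. It only gathers the numbers for comparison and does not by itself show whether the file system is corrupted.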

Environment

  • Red Hat Enterprise Linux 7.7 (RHEL)
  • Red Hat Software Collections (RHSC)
