Galera Cluster node failed, won't rejoin the cluster

Solution In Progress - Updated -

Issue

  • We have a three-node Galera cluster using rh-mariadb101. One node failed and will not rejoin the cluster. The error logs do not indicate why: they show only the service starting, beginning replication from another node, and then shutting down. We expect the node to complete replication and rejoin the cluster.

  • We suspect file system corruption because df -Th shows only 1G in use on the failed node, whereas the other two nodes show over 50G in use. However, xfs_repair does not appear to have changed anything, and the directories all look right.
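
The disk usage comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original report: the node names, the 0.5 threshold, and the helper names used_gib and flag_outliers are all hypothetical. The figures are the approximate ones quoted in the issue.

```python
import shutil


def used_gib(path):
    """Return the space in use on the file system containing `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.used / 2**30


def flag_outliers(used_by_node, ratio=0.5):
    """Return the nodes whose usage is below `ratio` times the (upper) median."""
    values = sorted(used_by_node.values())
    median = values[len(values) // 2]
    return [node for node, used in used_by_node.items() if used < ratio * median]


if __name__ == "__main__":
    # Approximate figures from the issue; the node names are hypothetical.
    reported = {"node1": 50.0, "node2": 50.0, "node3": 1.0}
    print("Nodes with unusually low usage:", flag_outliers(reported))

    # Local usage for the file system holding "/"; substitute the mount point
    # of the MariaDB data directory (its location is an assumption here).
    print("Used on /: %.1f GiB" % used_gib("/"))
```

Running used_gib on each node against the data directory's mount point yields the same figure that df reports in its used column, so the numbers can be collected and compared in one place.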

Environment

  • Red Hat Enterprise Linux 7.7 (RHEL)
  • Red Hat Software Collections (RHSCL)
