RHEL 6 High Availability cluster node gets fenced after its corosync process is stuck in function audit_log_start during the period in which it failed to send its token

Solution In Progress - Updated -

Issue

  • A node gets fenced frequently in our cluster
  • Nodes keep getting fenced, and the ps output captured by ha-resourcemon shows corosync sitting in function audit_log_start during the window in which it should be sending tokens but is apparently unresponsive
  • Why does corosync get stuck behind audit, causing a node to get fenced?
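
To confirm the symptom described above on a live node, the kernel wait channel (WCHAN) of corosync can be inspected directly. The following is a minimal sketch, not part of the original solution: the helper names parse_ps_wchan, stuck_in, and snapshot are hypothetical, and it assumes the procps `ps -eo pid,wchan:32,comm` output format (PID, wait channel, command name per line).

```python
import subprocess


def parse_ps_wchan(ps_output):
    """Parse `ps -eo pid,wchan:32,comm` text into (pid, wchan, comm) tuples.

    Lines whose first field is not a number (such as the header) are skipped.
    """
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        pid, wchan, comm = parts
        rows.append((int(pid), wchan, comm.strip()))
    return rows


def stuck_in(rows, comm="corosync", wchan="audit_log_start"):
    """Return the rows for `comm` whose wait channel starts with `wchan`."""
    return [r for r in rows if r[2] == comm and r[1].startswith(wchan)]


def snapshot():
    """Run ps once and return the parsed rows.

    Popen is used instead of subprocess.check_output because the latter
    is not available in Python 2.6, the system Python on RHEL 6.
    """
    proc = subprocess.Popen(["ps", "-eo", "pid,wchan:32,comm"],
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return parse_ps_wchan(out.decode())
```

Calling stuck_in(snapshot()) repeatedly during a suspected stall would return a non-empty list whenever corosync is waiting in audit_log_start at that moment. A single sample proves little, so several samples taken across the token-loss window are more telling.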

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • audit
  • An audit watch or rule that may trigger on corosync's operations
