clustered ip resource fails to start with 'IPv4 address collision' when using <dlm enable_fencing="0"/> in RHEL 6

Solution Verified

Issue

  • I have a service running on Node2 of the cluster. On rebooting Node2, the service fails over to Node1 (which is expected), but once Node2 comes back online, Node1 gets fenced and reboots, and the service hangs.
  • When a node boots up, it does not see the other node and proceeds to fence it post-join. While waiting for fencing to complete, it begins starting services and hits an IP collision:
Sep 27 14:32:06 node2 rgmanager[4457]: Starting stopped service service:IP
Sep 27 14:32:06 node2 rgmanager[5356]: [ip] Adding IPv4 address 10.1.2.3/24 to eth0
Sep 27 14:32:07 node2 rgmanager[5402]: [ip] IPv4 address collision 10.1.2.3
Sep 27 14:32:07 node2 rgmanager[4457]: start on ip "10.1.2.3/24" returned 1 (generic error)
Sep 27 14:32:07 node2 rgmanager[4457]: #68: Failed to start service:IP; return value: 1
  • When a node gets fenced, the other node recovers the service before fencing completes, which results in an IP collision.
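
The collision reported in the logs above can be reproduced manually with arping's Duplicate Address Detection mode. This is a diagnostic sketch, not necessarily the exact check the ip resource agent performs; the interface (eth0) and address (10.1.2.3) are taken from the log excerpt and must be adjusted for your environment. Run as root on the node that is about to take over the IP:

```shell
# DAD mode (-D): arping exits 0 if no other host answers ARP for the
# address (address is free), non-zero if a reply is received (collision).
# -q quiet, -c 2 probes, -w 3 second timeout, -I interface to probe on.
arping -q -c 2 -w 3 -D -I eth0 10.1.2.3 \
    && echo "10.1.2.3 is free" \
    || echo "IPv4 address collision 10.1.2.3"
```

If this reports a collision while the fenced node is still mid-reboot, the old node is still answering ARP for the service address, matching the failure sequence described above.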

Environment

  • Red Hat Enterprise Linux (RHEL) 5 or 6 with the High Availability Add On
  • <dlm enable_fencing="0"/> set in /etc/cluster/cluster.conf
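
For reference, the override sits as a child of the top-level cluster element in /etc/cluster/cluster.conf. The cluster name, node names, and config_version below are placeholders; only the dlm line is the setting in question. With enable_fencing="0", DLM lockspace recovery does not wait for fencing to complete, which is what allows rgmanager to start the IP resource while the other node may still hold the address:

```xml
<cluster name="example_cluster" config_version="2">
  <!-- Tells DLM not to wait for fencing before recovering lockspaces -->
  <dlm enable_fencing="0"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1"/>
    <clusternode name="node2" nodeid="2"/>
  </clusternodes>
</cluster>
```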
