ip resource failed with "Link for <iface>: Not detected" in a RHEL 5 or 6 High Availability cluster with rgmanager
Issue
- A partial network outage caused cluster services to go down.
- "kernel: e1000e: eth0 NIC Link is Down" messages appear in the node logs, after which the cluster service failed because the IP resource became unavailable.
- My service failed following error messages from a network driver and our ip resource:
Mar 24 23:44:11 nodeA kernel: e1000: eth2: e1000_watchdog_task: NIC Link is Down
Mar 24 23:44:15 nodeA kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Down
Mar 24 23:44:15 nodeA kernel: bonding: bond0: now running without any active interface !
Mar 24 23:44:20 nodeA clurgmgrd: [7851]: <warning> Link for bond0: Not detected
Mar 24 23:44:20 nodeA clurgmgrd: [7851]: <warning> No link on bond0...
Mar 24 23:44:20 nodeA clurgmgrd[7851]: <notice> Stopping service cluster_service
Mar 24 23:44:20 nodeA clurgmgrd: [7851]: <info> Executing /path/to/service stop
Mar 24 23:44:21 nodeA clurgmgrd: [7851]: <info> Removing IPv4 address XXX.XX.XX.XX from bond0
Environment
- Red Hat Enterprise Linux (RHEL) 5 or 6 with the High Availability Add On
- rgmanager
- One or more <ip/> resources configured in /etc/cluster/cluster.conf
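For reference, an <ip/> resource inside a service might look like the sketch below. The address is a placeholder and the service name is borrowed from the log excerpt. The monitor_link attribute is shown on the assumption that link monitoring is enabled for the resource, which appears to be the check that logs the "Link for <iface>: Not detected" warning. The configuration in an affected cluster may differ.

```xml
<!-- Hypothetical fragment of /etc/cluster/cluster.conf; the address is a placeholder -->
<rm>
  <service name="cluster_service" autostart="1">
    <!-- Assumption: monitor_link="1" makes the ip resource agent check
         the link state of the interface that holds this address -->
    <ip address="192.0.2.10" monitor_link="1"/>
  </service>
</rm>
```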
