A bonding's primary setting in /etc/sysconfig/network-scripts/ifcfg-bond0 is not effective

I have bonding configured on a few RHEL servers with the following settings. However, I notice that the BONDING_OPTS are not effective. When we pull a cable, or power off the switch that a NIC port in the bond is connected to, we get "Request timed out" with the settings below.

BONDING_OPTS parameters are as follows:

grep BONDING_OPTS /etc/sysconfig/network-scripts/ifcfg-bond0

BONDING_OPTS="mode=1 miimon=50 updelay=50 downdelay=50 primary=eth0"

cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth0 (primary_reselect always)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 50
Down Delay (ms): 50

Slave Interface: eth0
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: ac:16:2d:a7:a0:a8
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: ac:16:2d:a7:a0:a9
Slave queue ID: 0

Slave Interface: eth2
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: ac:16:2d:a7:a0:aa
Slave queue ID: 0

Slave Interface: eth3
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: ac:16:2d:a7:a0:ab
Slave queue ID: 0

I searched the RHEL knowledgebase for resolutions and found some documents saying the following:

Updating to kernel version 2.6.18-164.6.1 or later resolves this issue. If you use kernel-2.6.18-164.6.1.el5 or older, you need to set the bonding option in /etc/modprobe.conf:

options bonding mode=1 miimon=100 primary=eth0

However, I checked my kernel version and found that we are running Linux PMC-HR 2.6.32-431.5.1.el6.x86_64 #1 SMP Fri Jan 10 14:46:43 EST 2014 x86_64 x86_64 x86_64 GNU/Linux.

I still added the entry that Red Hat recommends for older kernel versions and rebooted the server. Even so, when a network cable is pulled from eth0 or eth1, the bond does not detect the failure, even though the link light is off.

We would greatly appreciate your assistance and expert advice on this issue.

I also attach the configuration files that I use for your review.

Regards
Jo

Responses

Each of the ifcfg-ethX files should have a HWADDR entry containing that interface's unique MAC address. This is usually how RHEL 5 persistently names network interfaces. That may not be the problem here, though; it's just "good housekeeping".
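
As a rough illustration of the HWADDR suggestion, a slave's file might look something like the sketch below. The MAC address is the "Permanent HW addr" that /proc/net/bonding/bond0 reports for eth0 in the question (in active-backup mode the slaves normally take on the bond's MAC while enslaved, so the permanent address is the one to use). The remaining lines are the usual settings for a bond slave and are an assumption, since the attached configuration files are not shown here.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 (sketch; adjust per interface)
DEVICE=eth0
# Permanent hardware address of this port, taken from /proc/net/bonding/bond0
HWADDR=AC:16:2D:A7:A0:A8
# Assumed typical slave settings, not copied from the poster's files
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```

Each of the other ifcfg-ethX files would carry its own DEVICE name and its own permanent address.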

You shouldn't need to add the bonding options to modprobe.conf on the recent version you are using. We recommend against using modprobe.conf for this, so that BONDING_OPTS is the only place the bonding options are specified.
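
To check that no bonding options are left behind in a modprobe file, something like the following small helper could be used. The function name is made up for illustration, and the default path is simply the /etc/modprobe.conf mentioned in the question; pass another file if the options were added elsewhere.

```shell
# Print any bonding "options" lines left in a modprobe configuration file.
# Hypothetical helper; exit status is non-zero when no such line is found.
find_bonding_module_options() {
    grep -E '^[[:space:]]*options[[:space:]]+(bonding|bond[0-9]+)([[:space:]]|$)' \
        "${1:-/etc/modprobe.conf}"
}
```

Any line it prints (for example the "options bonding mode=1 miimon=100 primary=eth0" entry added earlier) would be a candidate for removal, leaving BONDING_OPTS in ifcfg-bond0 as the single source of the settings.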

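As you've found, the problem is that the link-state change is not detected. Try adding use_carrier=0 to your BONDING_OPTS. That changes the way link state is queried from the hardware: with use_carrier=0 the bonding driver asks the NIC driver through the MII/ethtool ioctls instead of relying on the carrier state the driver reports to the kernel. Alternatively, you could use arp_interval and arp_ip_target instead of miimon.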
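
The two suggestions might look roughly like this in ifcfg-bond0. This is only a sketch: the first line is the poster's existing BONDING_OPTS with use_carrier=0 added, and in the ARP variant both the 1000 ms interval and the 192.0.2.1 target are placeholder assumptions (the target should be an address that always answers ARP on that network, such as the default gateway). The ARP variant drops miimon, updelay and downdelay, because MII monitoring and ARP monitoring are not meant to be enabled together and the two delays only apply to miimon.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond0 (only the BONDING_OPTS line is shown)

# Variant A: keep MII monitoring, but query link state via the MII/ethtool ioctls
BONDING_OPTS="mode=1 miimon=50 updelay=50 downdelay=50 use_carrier=0 primary=eth0"

# Variant B: ARP monitoring instead of miimon (use in place of variant A, not with it)
# arp_interval is in milliseconds; 192.0.2.1 is a placeholder target address
#BONDING_OPTS="mode=1 arp_interval=1000 arp_ip_target=192.0.2.1 primary=eth0"
```

After changing the line, the bond has to be brought down and up again (or the network service restarted) for the new options to take effect.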

You might also wish to ask the hardware vendor (it appears to be HP) if there is an updated firmware for your network interfaces, which might allow the link-state change to be communicated properly.

The vendor might also have their own driver version. You're welcome to use this, though keep in mind that third party drivers would move support for the network interface to the driver vendor instead of being supported by Red Hat. If the vendor driver works but our driver doesn't, please do open a support case and we'll try to get our driver repaired.
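
Whichever change is tried, one way to confirm whether the bond now notices a pulled cable is to compare each slave's MII Status and Link Failure Count before and after the test. The helper below is a minimal sketch (the function name is made up) that condenses the /proc/net/bonding/bond0 output shown in the question into one line per slave.

```shell
# Summarise a bonding status file as "<slave> <MII status> <link failure count>".
# Hypothetical helper; $1 defaults to /proc/net/bonding/bond0.
bond_slave_status() {
    awk -F': ' '
        /^Slave Interface:/                   { slave = $2 }
        /^MII Status:/ && slave != ""         { status = $2 }
        /^Link Failure Count:/ && slave != "" { print slave, status, $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

If link detection works, pulling the cable from eth0 should turn its status to "down", increase its failure count, and move "Currently Active Slave" to another interface.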

Additional references: