How to test NIC bonding?

I believe that I just set up network bonding on two interfaces. I want to test it to make sure that it will fail over. Is there a command that I can use to fail one interface over to the other?

 

Thank you.

 

Daryl

Responses

Just "ifdown <bonded_interface_with_active_status>" and check system messages or just disable port on switch/unplug the cable - it will be more real test :-)

Thank you.

 

This is much more challenging to do in the Solaris world. I guess I had my Solaris hat on and didn't even think about something as simple as ifdown and ifup.

 

Daryl

Depending on how you set up the trunks, anything short of an electrical interruption to the media path won't cause the trunk to fail that path. On Linux, you can "lie" to the system in any number of ways to simulate a failure (great when you and your server are in different time zones).

We've always found that using 'ifdown' to simulate a network failure is not good practice. The best option, obviously, is to physically pull the cable or have the network team disable the port. If that's not possible, use "ifenslave" to detach an interface from a bond. For example, if bond0=eth0,eth1 and eth0 is active, use "ifenslave -d bond0 eth0".
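
As a minimal sketch using the bond0/eth0/eth1 names above (substitute your own interface names):

# ifenslave -d bond0 eth0       (detach eth0; traffic should fail over to eth1)
# cat /proc/net/bonding/bond0   (confirm eth1 is now the active slave)
# ifenslave bond0 eth0          (re-attach eth0 once the test is done)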

There is a /proc/net/bonding/bond0 file, which you can cat.

 

This will display the bonding interfaces and their slaves.

 

For an actual failover test, you could physically remove the cable or shut the port on the switch side.
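
For reference, the output for an active-backup bond looks roughly like this (trimmed; driver version, counters and interface names will differ on your system):

# cat /proc/net/bonding/bond0
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
Slave Interface: eth0
MII Status: up
Slave Interface: eth1
MII Status: up

During a failover test, watch the "Currently Active Slave" and per-slave "MII Status" lines change.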

... well, some "smart alec" stuff (since the tool doesn't seem to be too widely known, and I was also surprised when I first saw it): you can use "ethtool". It will at least reliably show you the status of your interface after any other modifications.

You must have the "ethtool" package installed ...
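
For example, a quick check on one of the slaves (assuming eth0 here; adjust to your interface names):

# ethtool eth0 | grep "Link detected"
Link detected: yes

The "Link detected" line tells you whether the NIC actually sees carrier, independent of what the bonding driver currently reports.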

Everyone has already given the solution.

 

You can run this command

 

# cat /proc/net/bonding/bond0

 

# ifdown eth0 / eth1

 

While you down a particular NIC, you can check the status in real time:

 

# watch cat /proc/net/bonding/bond0

 

or else

 

you can physically unplug the cable from one NIC for testing purposes.

 

 

 

//shyfur

In addition to the previous answers, I find ifenslave very handy:

 

ifenslave -c|--change-active   <master-if> <slave-if>
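
For instance, assuming an active-backup bond0 where eth0 is currently active, this forces the active slave over to eth1 without touching the link itself:

# ifenslave -c bond0 eth1
# grep "Currently Active Slave" /proc/net/bonding/bond0

Note that changing the active slave only makes sense for modes that have a single active slave, such as active-backup.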

I agree with Duane that ifdown is not a sufficient test.

 

Think of what you are trying to achieve with bonding. You want:

 

  • resiliency against electrical failure (eg: NIC fault, SFP fault/pull, cable fault/cut/pull)
  • resiliency against logical failure (eg: someone logs onto the switch and puts an access-list on your switchport)

Ideally you should do an actual test for both of these. One with a physical cable pull, and one with some sort of logical interruption like an ACL or VLAN change.

 

Using ifenslave to remove and add an interface is more a test of the bonding driver's slave functionality than of its failover, but at least it will test that the bonded MAC fails over to the other interfaces (if the bonding mode does that) and that traffic continues to flow during a failure scenario.

 

Note that remotely shutting the switchport doesn't always result in the NIC/driver considering the port to be "down". Some network interfaces seem to require a physical cable pull for miimon to consider the interface as failed.

Have a look at http://www.kernel.org/doc/Documentation/networking/bonding.txt under "7. Link Monitoring".
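
For reference, a minimal sketch of MII link monitoring set via BONDING_OPTS in a RHEL-style ifcfg file (the device name and values here are only illustrative):

# cat /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="mode=active-backup miimon=100"

With miimon=100 the driver polls each slave's link state every 100ms, which is exactly the mechanism that may not notice a remotely shut switchport on some NICs.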

Using ARP monitoring might solve the problem of the NIC not failing/reporting down when a switch port is shut down. I had the same problem with some blades in an old enclosure that kept showing link on the presented NICs even though it had lost both uplinks.

 

For testing, I would, as mentioned, unplug the cable and/or down the interface. I would not use ifdown, but rather ifconfig eth0 down / ifconfig eth0 up. ifdown is a script that nicely downs the interface, and that's not what I want, is it!
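
A minimal test sequence along those lines, assuming eth0 is the currently active slave (on newer releases "ip link set eth0 down" does the same thing):

# ifconfig eth0 down            (simulate a failure on the active slave)
# cat /proc/net/bonding/bond0   (the other slave should now be active)
# ifconfig eth0 up              (restore the link once the test is done)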

 

Well spotted :) The ARP monitor is a good option for blades, as some models have backplane connectivity and there's no "single cable" to pull for a given interface. It's also a good way to confirm your network can pass "actual traffic", as opposed to just link connectivity.

 

A few things to keep in mind for arp_mon:

 

Check link status much less frequently than you would with miimon. A monitoring interval of 100ms is not unreasonable for miimon, as all that's involved is checking something in the driver on the system. The whole system call to check connectivity via ethtool is over in less than 1ms.

 

With the ARP monitor, a short interval can easily flood the network with ARP requests. Enough systems all using the same ARP target could even overwhelm the target. It's better to set the monitoring interval to at least 1000ms (ie: 1 second), and perhaps as high as 10 seconds, depending on the network.

 

It is better to set multiple ARP targets, so that your systems don't all think their bonds have failed just because the router or switch (or other RHEL system you're using as an ARP target) has a scheduled maintenance reboot.

 

Of course, ensure your ARP target is in the same broadcast domain as the server itself. ARP is a layer 2 protocol and is not routed.

 

Setting an ARP target outside the local LAN will result in no incoming ARP reply, which may cause the ARP monitor to consider the interface down simply because no other hosts have talked to it within the arp_interval (this behaviour differs between RHEL 4 and RHEL 5).
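
Putting those points together, a sketch of what ARP monitoring might look like in BONDING_OPTS (the mode, interval and target IPs are placeholders; the targets must be on the local subnet):

BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=192.168.1.1,192.168.1.254"

That polls two ARP targets once per second; with the defaults a slave should only be marked down when none of the targets answer, which avoids the single-target reboot problem mentioned above.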
