Intelligently autoconfiguring LACP/802.3ad?
Hello, everyone --
I was wondering if it's possible for a host to determine which network interfaces are supposed to be included in a LACP bond without sniffing for LACPDUs on each interface.
Let me add a little bit more detail around the scenario that I'm thinking about.
Let's say that I have a host with interfaces eth[0123], and all four interfaces are plugged into a switch. On the switch, the switchports that eth0 and eth1 are plugged into are configured with (Cisco syntax) "channel-group X mode active", but eth2 and eth3 are not in a port-channel. Is it possible, without sniffing frames off each interface, for the host to "see" that eth0 and eth1 should be configured as a bond with 802.3ad/LACP while eth2 and eth3 should be configured individually? Has anyone done this before?
If possible, I'd like to do this at the tail-end of a Kickstart, or maybe from a Puppet-script.
I'm considering lumping all of the interfaces into a bond, then looking at /proc/net/bonding/bond0 to see which interfaces aren't getting along with the rest of them, and pulling the dissidents out. But that seems clunky and I wanted to see if anyone had done anything like this before.
Thanks for your help!
-joe.-
Responses
Both the methods you covered were the first things which sprung to my mind as well.
You could also just brute-force the pairing - script bond configuration for every possible pair (eth0+eth1, eth0+eth2, and so on through to eth2+eth3) and stop when you get a bond which has two aggregated slaves.
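If it helps anyone scripting that, the pair enumeration itself is trivial in POSIX shell. This just builds the candidate list in that order (interface names are the ones from the original post); in real use each pair would then be tried as a throwaway 802.3ad bond and kept only if both slaves aggregated:

```shell
# Enumerate all unordered interface pairs, eth0+eth1 through eth2+eth3.
pairs=""
set -- eth0 eth1 eth2 eth3
while [ $# -gt 1 ]; do
  a=$1; shift
  for b in "$@"; do
    pairs="$pairs $a+$b"
  done
done
echo "$pairs"   # 6 pairs for 4 interfaces
```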
Personally I like your first idea of capturing Slow Protocols frames (LACPDUs) on each interface and marking it as "to bond" or "not to bond".
LACP is a negotiated configuration. The only way I'm aware of to tell if a link/path is LACP-ready is to look for the handshake packets sent from the (other) active LACP end-point (i.e., "snoop the interfaces").
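For what it's worth, a minimal sketch of that snooping, assuming tcpdump is available and using the interface names from the original post. LACPDUs ride in Slow Protocols frames (EtherType 0x8809), and an active-mode partner sends one roughly every second, so a few seconds of listening per port is enough:

```shell
# Listen briefly on each interface for Slow Protocols frames (0x8809).
# Needs root and tcpdump; interface names are assumptions.
lacp_ports=""
for i in eth0 eth1 eth2 eth3; do
  if timeout 3 tcpdump -c 1 -i "$i" ether proto 0x8809 >/dev/null 2>&1; then
    lacp_ports="$lacp_ports $i"
  fi
done
echo "LACP seen on:${lacp_ports:- (none)}"
```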
This, of course, assumes that your switch is set to LACP active-mode. If the switch is in passive-mode, you'd have to solicit, first (sequentially create single-NIC "bundles" to force the LACP capability-query to be sent out, record which links get a reply, then configure your real bundles).
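Roughly, that solicitation loop could look like the following - shown here as a dry-run (RUN=echo just prints each step); drop the RUN= indirection and run as root to actually probe. The bond-probe name and the 5-second wait are my own assumptions:

```shell
# Sequentially put each NIC into a throwaway 802.3ad bond so the host
# transmits LACPDUs, wait, then check for a partner reply. iproute2 syntax.
RUN=echo   # dry-run: print commands instead of executing them
for i in eth0 eth1 eth2 eth3; do
  $RUN ip link add bond-probe type bond mode 802.3ad
  $RUN ip link set "$i" down
  $RUN ip link set "$i" master bond-probe
  $RUN ip link set bond-probe up
  $RUN sleep 5
  # With no partner, this line reads all-zeros in the bonding procfile.
  $RUN grep 'Partner Mac Address' /proc/net/bonding/bond-probe
  $RUN ip link del bond-probe
done
```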
The down side to the brute-force method (against passive-mode switch configs) is that, if your host is cabled across redundant switches and the switches haven't been configured to allow switch-spanning bundles, your automated results may be sub-optimal. You can also get sub-optimal results if your host-side ports are using different NIC drivers (so you'll want logic to address this possibility).
Does anyone know if Spanning Tree would flip out if you configured your 4 interfaces to be in one port-group, yet the interfaces were actually plugged into ports belonging to several different groups? In particular, if that host-side bond presented a single MAC address across all the interfaces (or is it only active-passive bonding that does that?).
Also - another reason I would anticipate that plan may not work: if you happen to be kickstarting your host and your kickstart expects to use eth0 (em1) based on the VLAN for your environment, but eth3 instead ends up being the primary/preferred interface for the bond (and happens to be on the wrong VLAN), I would expect that to fail.
I have wondered the same thing as you - does a switch port "advertise" configuration data (in addition to the capabilities that it may advertise - i.e. speed, duplex, etc...)?
MACs are generally discrete in a LACP bond: it's part of how the negotiation keeps track of which leg of the bond to send packets across and keeps things properly sequenced.
KickStart is one of the reasons why you'd be running your switches in passive LACP mode. It allows you to run in "standard" mode until such time as the other end-point announces, "I'm ready for LACP: how about you?"
As to capabilities advertisement, it's more a question of "what do you mean by advertise". For GigE and higher, autoneg is the default mode (used to be, some 10/100/1000 NICs simply wouldn't run above 100 if you didn't autoneg). Capabilities are "advertised" in as much as the endpoints say to each other, "hey, wanna talk? You do? Can you talk at this speed? You can? Awesome: let's go."
What I've observed from attempting to LACP-bond interfaces with mismatched drivers (or bundles that span switch-pairs that don't support bond-spanning) is that one set will get one aggregator ID and the other set will get another. You should see similar ID-groupings if the switch side is presenting two different bonding-groups.
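To make that concrete, here's a sketch of grouping slaves by their per-slave "Aggregator ID" lines. The here-string is a hypothetical, simplified sample of /proc/net/bonding/bond0 (the real file has many more fields); on a live host you'd read the file itself. The largest group is the real bond, and anything outside it is a dissident to pull back out:

```shell
# Hypothetical sample of the bonding procfile in 802.3ad mode; on a real
# host substitute: cat /proc/net/bonding/bond0
sample='Slave Interface: eth0
Aggregator ID: 1
Slave Interface: eth1
Aggregator ID: 1
Slave Interface: eth2
Aggregator ID: 2
Slave Interface: eth3
Aggregator ID: 3'

# Collect slaves under each aggregator ID and print the groupings.
groups=$(printf '%s\n' "$sample" | awk '
  /^Slave Interface:/ { iface = $3 }
  /^Aggregator ID:/   { agg[$3] = agg[$3] " " iface }
  END { for (id in agg) print "aggregator " id ":" agg[id] }
' | sort)
printf '%s\n' "$groups"
```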
Joe, I was playing with two LLDP clients on RHEL 6 and 7 with a Cat 4948E running 15.2(1)E3. The client which shows the most promise is https://github.com/vincentbernat/lldpd/. There are premade RPMs available here: http://software.opensuse.org/download.html?project=home:vbernat&package=lldpd. I tried the "lldpad" RPM made available by Red Hat, but it does not offer much help (at least in my limited testing). The lldpd daemon and its "lldpcli" utility offer a "show neighbors" command which can display link-aggregation details (as referenced here: https://github.com/vincentbernat/lldpd/issues/36). Unfortunately I was unable to see those link-aggregation details when running it against the Cat 4948E using either CDP or LLDP. I plan to try it against Cat 3750G switches and possibly some others in my lab.
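As a sketch of how that check might be scripted once lldpd is talking to a cooperative switch: the real command is "lldpcli show neighbors details", but the sample text below is hypothetical output standing in for it, since the exact field layout varies by lldpd version and by what the switch actually sends in its dot3 link-aggregation TLV:

```shell
# Hypothetical neighbor output; on a live host substitute:
#   lldpcli show neighbors details
sample='Interface: eth0, via: LLDP
  Port:
    Aggregation: aggregated, id 5
Interface: eth2, via: LLDP
  Port:
    Aggregation: not aggregated'

# Flag any interface whose neighbor advertises an aggregated port.
to_bond=$(printf '%s\n' "$sample" | awk '
  /^Interface:/ { iface = $2; sub(/,$/, "", iface) }
  /Aggregation: aggregated/ { print iface }
')
echo "candidates for bonding: $to_bond"
```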
There is a documented implementation of using LLDP to detect when LACP should be dynamically configured (http://rickardnobel.se/lacp-and-esxi-5-1/ references VMware ESXi and their virtual switch).
I'm a little late to the party, but for others wandering in from Google: may I suggest quoting the "mode=active-active" statement to produce this network command:
network --device=bond0 --bootproto=dhcp --onboot=yes --bondopts="mode=active-active" --bondslaves=eno1,eno2 --activate
... and seeing whether the quoting stops the parser from stumbling over the two equals signs?
