802.1Q + bond (LACP EtherChannel)

I am trying to configure an LACP EtherChannel trunk (802.1Q) from a RHEL server to Cumulus Linux. I can get it up, running, and stable, but the initial setup is very frustrating: bringing up a bonded trunk does not seem consistent or automatable.

Let's say you have a Cumulus Linux switch:
CL swp1 -> eth1 RHEL
CL swp2 -> eth2 RHEL
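
For context, a minimal sketch of what the matching Cumulus side might look like (ifupdown2 syntax in /etc/network/interfaces; the bridge-vids line is an assumption matching the VLAN 400 example later in this post, and only applies if bond0 is a port of a VLAN-aware bridge):

auto bond0
iface bond0
    bond-slaves swp1 swp2
    bond-mode 802.3ad
    bond-lacp-rate 1
    bridge-vids 400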

On the RHEL side->
/etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1
NAME=bond0-slave
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
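
As an aside, the same slave definition can be created with nmcli instead of a hand-written flat file (the connection name here is an assumption). On RHEL 7, with the default ifcfg-rh plugin, nmcli writes the corresponding ifcfg file itself, which avoids NetworkManager not knowing about the file:

nmcli con add type bond-slave con-name bond0-slave-eth1 ifname eth1 master bond0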

For the bond
/etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0
NAME=bond0
TYPE=Bond
BONDING_MASTER=yes
IPADDR=1.1.1.1
PREFIX=24
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1 xmit_hash_policy=layer3+4"
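
(In those BONDING_OPTS, mode=4 is 802.3ad and lacp_rate=1 is "fast".) Once the bond exists, the bonding driver reports the negotiated LACP state directly, which is a quick way to confirm the switch side agreed:

cat /proc/net/bonding/bond0

This should report "Bonding Mode: IEEE 802.3ad Dynamic link aggregation" plus an 802.3ad info section with the aggregator details for each slave.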

For this example I will just do one VLAN, but you can obviously have thousands of these
/etc/sysconfig/network-scripts/ifcfg-bond0.400

DEVICE=bond0.400
IPADDR=10.11.0.100
PREFIX=24
ONBOOT=yes
BOOTPROTO=none
VLAN=yes
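
Since each VLAN sub-interface file differs only in the VID and the address, the per-VLAN files can be generated rather than hand-written. A minimal sketch (the output directory default and the vid/address list are assumptions for illustration; point OUTDIR at /etc/sysconfig/network-scripts on a real box):

```shell
#!/bin/sh
# Generate ifcfg-bond0.<VID> files for a list of VLANs.
# OUTDIR defaults to a local demo directory; the vid/ip pairs are examples.
OUTDIR=${OUTDIR:-./vlan-ifcfg-demo}
mkdir -p "$OUTDIR"

# Each input line is "<vid> <ip>"; one ifcfg file is written per line.
while read -r vid ip; do
cat > "$OUTDIR/ifcfg-bond0.$vid" <<EOF
DEVICE=bond0.$vid
IPADDR=$ip
PREFIX=24
ONBOOT=yes
BOOTPROTO=none
VLAN=yes
EOF
done <<'LIST'
400 10.11.0.100
401 10.11.1.100
LIST
```

After generating files this way, a `nmcli connection reload` is still needed so NetworkManager notices them.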

So with these flat files we do this->
modprobe bonding
modprobe 8021q
ifup ifcfg-eth1

This is where the problem is->
Error: Connection activation failed: Master device bond0 unmanaged or not available for activation

So even if I try to bring up bond0 first (and this is with Vagrant, so I can destroy and start from scratch), I get the same thing:
Error: Connection activation failed: Master device bond0 unmanaged or not available for activation

So it's like a chicken-and-egg thing? I have found that I have to 'jiggle' the bond, and I literally can't figure out the magic combination to get it up. Once it's up it works great and stays up, unless I do a vagrant destroy on the RHEL VM.

One time, to get it working, I did:
ifdown bond0
ip link del bond0
ifdown eth1
ifup eth1

Then it came online magically. But this does not work every time. Once it's up I can do an nmcli conn reload and it's stable... but the initial setup seems very sketchy/flaky, and it makes using Ansible impossible, since I basically have to put ignore_errors: yes on everything :(
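
For what it's worth, the order that should in principle be deterministic with NetworkManager is: reload first, so it actually reads the newly written flat files, then bring up the master before the slaves. A sketch (not verified on every RHEL release):

nmcli connection reload
ifup bond0
ifup eth1
ifup eth2
ifup bond0.400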

What should I do here? What additional information do you need? The bond on the Cumulus Linux side is not the problem (it bonds fine to Cisco, Arista, Debian, Ubuntu, etc.), so this is definitely a RHEL/CentOS thing.
