802.1Q + bond (LACP etherchannel)
I am trying to configure an LACP EtherChannel trunk (802.1Q) from a RHEL server to Cumulus Linux. I can get it up, running, and stable, but the initial setup is very frustrating. Bringing up a bonded trunk does not seem consistent or automatable.
Let's say you have a Cumulus Linux switch cabled like this:
CL SWP1 -> eth1 RHEL
CL SWP2 -> eth2 RHEL
On the RHEL side->
/etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
NAME=bond0-slave
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
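eth2 gets the same treatment; as a sketch, the second slave file simply mirrors eth1 (I give it a distinct NAME just for clarity):
/etc/sysconfig/network-scripts/ifcfg-eth2
DEVICE=eth2
NAME=bond0-slave-eth2
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes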
For the bond
/etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
NAME=bond0
TYPE=Bond
BONDING_MASTER=yes
IPADDR=1.1.1.1
PREFIX=24
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1 xmit_hash_policy=layer3+4"
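Once the bond does come up, the LACP state can be sanity-checked from the kernel's bonding status file (standard bonding-driver output):
cat /proc/net/bonding/bond0
# expect "Bonding Mode: IEEE 802.3ad Dynamic link aggregation"
# plus an "802.3ad info" section and an Aggregator ID per slave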
For this example I will just do one VLAN, but you can obviously have thousands of these:
/etc/sysconfig/network-scripts/ifcfg-bond0.400
DEVICE=bond0.400
IPADDR=10.11.0.100
PREFIX=24
ONBOOT=yes
BOOTPROTO=none
VLAN=yes
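When the tagged interface exists, it can be verified with:
ip -d link show bond0.400
# the detailed output should show bond0.400@bond0 with "vlan protocol 802.1Q id 400"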
So with these flat files we do this->
modprobe bonding
modprobe 8021q
ifup ifcfg-eth1
This is where the problem is->
Error: Connection activation failed: Master device bond0 unmanaged or not available for activation
So even if I try to up the bond0 first (and this is with vagrant so I can destroy and start from scratch)
Error: Connection activation failed: Master device bond0 unmanaged or not available for activation
So it's like a chicken-and-egg thing? I have found that I have to 'jiggle' the bond, but I can't figure out the magic combination to get it up. Once it's up it will work great and stay up unless I do a vagrant destroy on the RHEL VM.
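For reference, here is a quick way to check whether NetworkManager even considers the devices managed (which seems to be what the error is complaining about):
nmcli device status
# eth1/eth2/bond0 showing "unmanaged" here would explain the activation failure
nmcli connection show
# should list the connections NetworkManager has loaded from the ifcfg files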
One time to get it working I did a
ifdown bond0
ip link del bond0
ifdown eth1
ifup eth1
Then it came online magically, but this does not happen every time. Once it's up I can do an nmcli conn reload and it's stable... but the initial setup seems very sketchy/flaky and makes using Ansible impossible, since I basically have to set ignore_errors: yes on everything :(
What should I do here? What additional information do you need? The bond on the Cumulus Linux side is not the problem (it bonds fine to Cisco, Arista, Debian, Ubuntu, etc.), so this is definitely a RHEL/CentOS thing.
Responses
Your config files look good, and you shouldn't have to modprobe anything.
If you're using the initscripts, then just ifup bond0 or service network restart should do it.
If you're using NetworkManager, I think you could just nmcli con up bond0 or maybe one of the slaves. Personally I would use nmtui.
You should be able to name a bond bond0 with NetworkManager, I have RHEL7 systems setup that way.
It's possible that modprobing the bonding module first has created an interface called bond0 and NM is running into some problems naming a new connection (the NM abstraction) over an existing device.
It's not too convenient, but I think with your current config files you could probably just reboot and have everything come up as bond0.
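If you want the initscripts rather than NetworkManager to own these interfaces, one approach (a sketch, not tested against your exact config) is to mark each ifcfg file as not NM-controlled and restart the network service:
# add to ifcfg-eth1, ifcfg-eth2, ifcfg-bond0 and ifcfg-bond0.400
NM_CONTROLLED=no
# then
systemctl restart network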
Ah, that's in Section 4.4 about using the command line. Section 4.3 about using nmcli doesn't list such a requirement.
From an updated RHEL 7 system with no bonding module loaded, following Section 4.3, I was able to get a bond named bond0 up and working fine:
# nmcli con add type bond con-name bond0 ifname bond0 mode active-backup
# nmcli con add type bond-slave ifname eth1 master bond0
# nmcli con up bond-slave-eth1
# nmcli con up bond0
These steps create the ifcfg- files for the bond and slaves too.
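For a two-slave LACP setup like yours, I'd expect the equivalent to be something like this (untested sketch; mode 802.3ad and VLAN id 400 are taken from your question, and the rest of your BONDING_OPTS would still need to be applied):
# nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad
# nmcli con add type bond-slave ifname eth1 master bond0
# nmcli con add type bond-slave ifname eth2 master bond0
# nmcli con add type vlan con-name bond0.400 dev bond0 id 400 ip4 10.11.0.100/24
# nmcli con up bond-slave-eth1
# nmcli con up bond-slave-eth2
# nmcli con up bond0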
My recommendation would be to use nmtui if possible, nmcli if not.
The previous author of the RHEL 7 Networking Guide definitely did test everything in there, though it's possible the behaviour of NM for the command-line steps has changed since RHEL 7.0.
If you can supply a reproducible set of steps showing where the doc is wrong or doesn't behave as desired, please do log a bug against the doc-Networking_Guide component and the current author(s) can review the steps.
