LACP mode 4 bonding not coming up on RHEL 7

Hi Experts

We followed RHEL documentation to configure bonding on RHEL7. However, the interfaces don't pass any traffic, not even ARP. Please advise.

What do "port state=63" and "port key=13" mean? Bond1 has 10 Gbps slave interfaces and has this issue. Bond0 has 1 Gbps slave interfaces and works fine in fault-tolerant mode.

# cat ifcfg-bond1
NAME=bond1
DEVICE=bond1
BONDING_MASTER=yes
TYPE=Bond
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1"
NM_CONTROLLED=no
# cat /sys/class/net/bond1/bonding/mode
802.3ad 4

# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 3c:fd:fe:a7:d8:b8
Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 2
    Actor Key: 13
    Partner Key: 26250
    Partner Mac Address: 64:0e:94:34:06:ec

Slave Interface: ens6f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 3c:fd:fe:a7:d8:b8
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 3c:fd:fe:a7:d8:b8
    port key: 13
    port priority: 255
    port number: 1
    port state: 63
details partner lacp pdu:
    system priority: 32768
    system mac address: 64:0e:94:34:06:ec
    oper key: 26250
    port priority: 32768
    port number: 72
    port state: 63

Slave Interface: ens6f2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 3c:fd:fe:a7:d8:ba
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 3c:fd:fe:a7:d8:b8
    port key: 13
    port priority: 255
    port number: 2
    port state: 63
details partner lacp pdu:
    system priority: 32768
    system mac address: 64:0e:94:34:06:ec
    oper key: 26250
    port priority: 32768
    port number: 72
    port state: 63

Regards,
Sumanta.

Responses

Hi Sumanta Ghosh, has whoever administers your network switch already configured LACP (802.3ad) on the two switch ports your system is connected to? Without that, your bonded NIC won't go anywhere.

The solutions at Red Hat are now polluted with nmcli methods instead of editing the files directly. I can come back tomorrow and provide an example.

Regards,

-RJ

The "port state" definitions are:

/* Port state definitions (43.4.2.2 in the 802.3ad standard) */
#define AD_STATE_LACP_ACTIVITY   0x1
#define AD_STATE_LACP_TIMEOUT    0x2
#define AD_STATE_AGGREGATION     0x4
#define AD_STATE_SYNCHRONIZATION 0x8
#define AD_STATE_COLLECTING      0x10
#define AD_STATE_DISTRIBUTING    0x20
#define AD_STATE_DEFAULTED       0x40
#define AD_STATE_EXPIRED         0x80

The keys are values each side assigns locally to mark which of its ports can be aggregated together; they are not negotiated and are not an error indicator, so don't worry about those values.

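A port state of 63 is 0x3f, which means the six low bits are set: LACP activity, short timeout, aggregation, synchronization, collecting and distributing. AD_STATE_DEFAULTED (0x40) and AD_STATE_EXPIRED (0x80) are not set for either the actor or the partner, so according to this output LACPDUs are being received and both ends consider the ports in sync.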
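
To check which flags a given "port state" number carries, the value can be tested bit by bit against the flag definitions above. Here is a minimal sketch in POSIX shell; the function name decode_port_state is made up for illustration and is not part of the bonding driver or any tool.

```shell
# decode_port_state is a hypothetical helper, not part of any package.
# $1: decimal port state as printed in /proc/net/bonding/<bond>
decode_port_state() {
    state=$1
    out=""
    # bit:name pairs mirror the AD_STATE_* definitions (0x1 .. 0x80).
    for pair in \
        1:LACP_ACTIVITY 2:LACP_TIMEOUT 4:AGGREGATION 8:SYNCHRONIZATION \
        16:COLLECTING 32:DISTRIBUTING 64:DEFAULTED 128:EXPIRED
    do
        bit=${pair%%:*}
        name=${pair#*:}
        # Keep the flag name when its bit is set in the state value.
        if [ $(( state & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

decode_port_state 63
```

For the value 63 from the output in the question, this prints LACP_ACTIVITY LACP_TIMEOUT AGGREGATION SYNCHRONIZATION COLLECTING DISTRIBUTING. DEFAULTED (64) and EXPIRED (128) would only appear for values that include those bits.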

You're running in Fast Mode (LACPDU every 1 second) with lacp_rate=1. Is the switch also in Fast Mode, or is the switch in Slow Mode (LACPDU every 30 seconds)? Both ends need to match.

If you wish to gain a thorough understanding, search Google for seaman_1_0399.pdf, an IEEE paper which describes the protocol and state machine as the standard defines them.

Hi All

Thanks for the response. LACP is already configured on the switch side, in fast mode. I am not sure whether the issue is on the server side or the switch side.

Regards, Sumanta.

Capture on the slave interfaces for a couple of minutes. In Wireshark, use the display filter slow ("Slow Protocols") to show only the LACPDUs. That'll let you see which side is or isn't sending, and what the flags are.
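
If the capture has to be taken on the server itself, LACPDUs can be selected by their EtherType, 0x8809 (Slow Protocols). The following is a rough sketch, assuming tcpdump and coreutils timeout are installed and the commands are run as root; the function name capture_lacp and the /tmp output path are made up for illustration, and the interface names are the ones from the original post.

```shell
# capture_lacp is a hypothetical helper name; run it as root.
# $1: slave interface (e.g. ens6f0)
# $2: capture duration in seconds (defaults to 120)
capture_lacp() {
    iface=$1
    secs=${2:-120}
    # "ether proto 0x8809" keeps only Slow Protocols frames (LACPDUs).
    # -w writes raw frames to a pcap file that Wireshark can open.
    timeout "$secs" tcpdump -i "$iface" \
        -w "/tmp/lacp-$iface.pcap" ether proto 0x8809
}

# Example calls, one per slave of bond1:
#   capture_lacp ens6f0
#   capture_lacp ens6f2
```

Capturing on each slave rather than on bond1 keeps the two links separate, so it is visible per port whether frames are going out, coming in, or both.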

Same issue here, and I can confirm the switch is in fast mode.

Was no solution found in this thread? I am having the same issue. LACP is already configured on the Cisco switch, but I am unable to ping VMs running on a bond configured as mode 4.