bnx2x devices do not fail a faulty link
Environment
- Red Hat Enterprise Linux (RHEL) 6.3
- kernels 2.6.32-220.4.2.el6.x86_64, 2.6.32-220.7.1.el6.x86_64
Issue
- On RHEL 6.3, bnx2x devices do not fail the link when they see a large number of rx errors and overruns. With two bnx2x devices, eth0 and eth1, in a bond, eth0 encounters a large number of rx errors and overruns, yet ethtool still reports the link as detected, so the bond never fails over to eth1, which is not experiencing the rx errors and overruns. The problem was seen in a blade enclosure running a mix of RHEL 5 and RHEL 6 systems: the bonds on RHEL 5 failed over correctly, whereas the RHEL 6 bonds did not.
eth0      Link encap:Ethernet  HWaddr 00:26:55:1B:7C:08
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:69707637 errors:18823718 dropped:0 overruns:18823718 frame:0
          TX packets:17433943 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:12419185456 (11.5 GiB)  TX bytes:10349326962 (9.6 GiB)
          Interrupt:28 Memory:f5000000-f57fffff

bond1     Link encap:Ethernet  HWaddr 00:26:55:1B:7C:08
          inet6 addr: fe80::226:55ff:fe1b:7c08/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:114978046 errors:18823718 dropped:0 overruns:18823718 frame:0
          TX packets:17433943 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:15902641969 (14.8 GiB)  TX bytes:10349326962 (9.6 GiB)
Settings for eth0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
                                2500baseX/Full
                                10000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
                                2500baseX/Full
                                10000baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 16
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x00000000 (0)
        Link detected: yes

Settings for eth1:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
                                2500baseX/Full
                                10000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
                                2500baseX/Full
                                10000baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 17
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x00000000 (0)
        Link detected: yes
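
To put a number on the rx errors in output like the above, the RX counter line can be parsed and the error count compared with the packet count. The following is a minimal Python sketch, not part of the original solution; the function names and the choice of errors divided by packets as the metric are assumptions made for illustration.

```python
import re

# Matches the RX counter line printed by ifconfig, for example:
# "RX packets:69707637 errors:18823718 dropped:0 overruns:18823718 frame:0"
RX_LINE = re.compile(
    r"RX packets:(\d+)\s+errors:(\d+)\s+dropped:(\d+)\s+overruns:(\d+)\s+frame:(\d+)"
)
FIELDS = ("packets", "errors", "dropped", "overruns", "frame")


def parse_rx_counters(ifconfig_text):
    """Return the first RX counter line found as a dict of ints, or None."""
    match = RX_LINE.search(ifconfig_text)
    if match is None:
        return None
    return dict(zip(FIELDS, map(int, match.groups())))


def rx_error_ratio(ifconfig_text):
    """Return rx errors divided by rx packets (an illustrative metric only)."""
    counters = parse_rx_counters(ifconfig_text)
    if counters is None or counters["packets"] == 0:
        return None
    return counters["errors"] / counters["packets"]
```

Applied to the eth0 RX line shown above, `rx_error_ratio` returns about 0.27, which is far from healthy even though ethtool reports "Link detected: yes".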
Resolution
Update to kernel-2.6.32-358 or later, as described in erratum RHSA-2013:0496-2.
Root Cause
The device never fails and its link is never marked as down despite the rx errors and overruns, so the bond never fails over to a healthy device.
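
Because the link state stays up, the condition can only be spotted by looking at the error counters alongside the carrier state. The sketch below is one hedged way to flag an interface in that state by reading the kernel's sysfs files under /sys/class/net. The function names, the 1% threshold, and the `sysfs_root` parameter are assumptions made for illustration; this is a diagnostic aid and not a replacement for the kernel update.

```python
from pathlib import Path


def read_counter(path):
    """Read a single integer from a sysfs-style file."""
    return int(Path(path).read_text().strip())


def link_up_but_erroring(iface, threshold=0.01, sysfs_root="/sys/class/net"):
    """Return True if iface reports carrier up while rx_errors / rx_packets
    exceeds threshold. The threshold is an arbitrary illustrative value."""
    base = Path(sysfs_root) / iface
    try:
        carrier = read_counter(base / "carrier")
        packets = read_counter(base / "statistics" / "rx_packets")
        errors = read_counter(base / "statistics" / "rx_errors")
    except OSError:
        # The interface is missing, or carrier cannot be read
        # (for example, the interface is administratively down).
        return False
    if carrier != 1 or packets == 0:
        return False
    return errors / packets > threshold
```

Calling `link_up_but_erroring("eth0")` on a system in the state described here would be expected to return True for eth0 and False for eth1, which matches the symptom: the link is reported as up, so nothing triggers a failover.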