RHEL8.0 and 8.1 - hw csum failure seen in dmesg and console (using mlx5/Mellanox)

Solution In Progress - Updated -

Issue

On RHEL 8.0 and 8.1 systems using a Mellanox NIC with the mlx5 driver, "hw csum failure" messages followed by a call trace appear in dmesg and on the console.

The journal captured in the sosreport shows the following:

cat ./sos_commands/logs/journalctl_--no-pager_--catalog_--boot
(...)
-- The start-up result is RESULT.
Dec 18 14:16:46 host-0 kernel: br-ctlplane: hw csum failure
Dec 18 14:16:46 host-0 kernel: CPU: 23 PID: 0 Comm: swapper/23 Kdump: loaded Not tainted 4.18.0-147.0.3.el8_1.x86_64 #1
Dec 18 14:16:46 host-0 kernel: Hardware name: Dell Inc. PowerEdge R640/XXXXXX, BIOS 2.4.8 11/26/2019
Dec 18 14:16:46 host-0 kernel: Call Trace:
Dec 18 14:16:46 host-0 kernel:  <IRQ>
Dec 18 14:16:46 host-0 kernel:  dump_stack+0x5c/0x80
Dec 18 14:16:46 host-0 kernel:  __skb_checksum_complete+0xac/0xc0
Dec 18 14:16:46 host-0 kernel:  icmpv6_error+0x25e/0x2bc [nf_conntrack_ipv6]
Dec 18 14:16:46 host-0 kernel:  nf_conntrack_in+0xf5/0x520 [nf_conntrack]
Dec 18 14:16:46 host-0 kernel:  ? nft_do_chain_ipv6+0x7e/0xb0 [nf_tables]
Dec 18 14:16:46 host-0 kernel:  nf_hook_slow+0x44/0xc0
Dec 18 14:16:46 host-0 kernel:  ipv6_rcv+0x437/0x4e0
Dec 18 14:16:46 host-0 kernel:  ? ip6_make_skb+0x1d0/0x1d0
Dec 18 14:16:46 host-0 kernel:  __netif_receive_skb_core+0x57e/0xbd0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_poll_rx_cq+0xd5/0x920 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  ? ktime_get+0x36/0xa0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_napi_poll+0x263/0xce0 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  process_backlog+0xa6/0x160
Dec 18 14:16:46 host-0 kernel:  net_rx_action+0x149/0x3b0
Dec 18 14:16:46 host-0 kernel:  __do_softirq+0xe3/0x30a
Dec 18 14:16:46 host-0 kernel:  irq_exit+0x100/0x110
Dec 18 14:16:46 host-0 kernel:  do_IRQ+0x85/0xd0
Dec 18 14:16:46 host-0 kernel:  common_interrupt+0xf/0xf
Dec 18 14:16:46 host-0 kernel:  </IRQ>
Dec 18 14:16:46 host-0 kernel: RIP: 0010:cpuidle_enter_state+0xb7/0x2a0
Dec 18 14:16:46 host-0 kernel: Code: e8 ae 6e a5 ff 80 7c 24 03 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d7 01 00 00 31 ff e8 90 5d ab ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 4c 29 f3 ba ff ff ff 7f 48 39 c3 7f
Dec 18 14:16:46 host-0 kernel: RSP: 0018:ffffb59886773e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd5
Dec 18 14:16:46 host-0 kernel: RAX: ffff9b35a0de3100 RBX: 00000026689eab4b RCX: 000000000000001f
Dec 18 14:16:46 host-0 kernel: RDX: 00000026689eab4b RSI: 000000003d1879ab RDI: 0000000000000000
Dec 18 14:16:46 host-0 kernel: RBP: 0000000000000003 R08: 0000000000000004 R09: 0000000000022940
Dec 18 14:16:46 host-0 kernel: R10: 000000aecbc39a3a R11: ffff9b35a0de20a8 R12: ffff9b35a0dedd18
Dec 18 14:16:46 host-0 kernel: R13: ffffffff9c125538 R14: 00000026680cfe2a R15: 0000000000000000
Dec 18 14:16:46 host-0 kernel:  ? cpuidle_enter_state+0x92/0x2a0
Dec 18 14:16:46 host-0 kernel:  do_idle+0x236/0x280
Dec 18 14:16:46 host-0 kernel:  cpu_startup_entry+0x6f/0x80
Dec 18 14:16:46 host-0 kernel:  start_secondary+0x1a7/0x200
Dec 18 14:16:46 host-0 kernel:  secondary_startup_64+0xb7/0xc0
Dec 18 14:16:46 host-0 kernel: br-ctlplane: hw csum failure
Dec 18 14:16:46 host-0 kernel: CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 4.18.0-147.0.3.el8_1.x86_64 #1
Dec 18 14:16:46 host-0 kernel: Hardware name: Dell Inc. PowerEdge R640/XXXXXXXX, BIOS 2.4.8 11/26/2019
Dec 18 14:16:46 host-0 kernel: Call Trace:
Dec 18 14:16:46 host-0 kernel:  <IRQ>
Dec 18 14:16:46 host-0 kernel:  dump_stack+0x5c/0x80
Dec 18 14:16:46 host-0 kernel:  __skb_checksum_complete+0xac/0xc0
Dec 18 14:16:46 host-0 kernel:  icmpv6_error+0x25e/0x2bc [nf_conntrack_ipv6]
Dec 18 14:16:46 host-0 kernel:  nf_conntrack_in+0xf5/0x520 [nf_conntrack]
Dec 18 14:16:46 host-0 kernel:  ? nft_do_chain_ipv6+0x7e/0xb0 [nf_tables]
Dec 18 14:16:46 host-0 kernel:  nf_hook_slow+0x44/0xc0
Dec 18 14:16:46 host-0 kernel:  ipv6_rcv+0x437/0x4e0
Dec 18 14:16:46 host-0 kernel:  ? ip6_make_skb+0x1d0/0x1d0
Dec 18 14:16:46 host-0 kernel:  __netif_receive_skb_core+0x57e/0xbd0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_poll_rx_cq+0xd5/0x920 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  ? ktime_get+0x36/0xa0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_napi_poll+0x263/0xce0 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  process_backlog+0xa6/0x160
Dec 18 14:16:46 host-0 kernel:  net_rx_action+0x149/0x3b0
Dec 18 14:16:46 host-0 kernel:  __do_softirq+0xe3/0x30a
Dec 18 14:16:46 host-0 kernel:  irq_exit+0x100/0x110
Dec 18 14:16:46 host-0 kernel:  do_IRQ+0x85/0xd0
Dec 18 14:16:46 host-0 kernel:  common_interrupt+0xf/0xf
Dec 18 14:16:46 host-0 kernel:  </IRQ>
Dec 18 14:16:46 host-0 kernel: RIP: 0010:cpuidle_enter_state+0xb7/0x2a0
Dec 18 14:16:46 host-0 kernel: Code: e8 ae 6e a5 ff 80 7c 24 03 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d7 01 00 00 31 ff e8 90 5d ab ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 4c 29 f3 ba ff ff ff 7f 48 39 c3 7f
Dec 18 14:16:46 host-0 kernel: RSP: 0018:ffffb59886783e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd8
Dec 18 14:16:46 host-0 kernel: RAX: ffff9b35a0e63100 RBX: 00000026947c130d RCX: 000000000000001f
Dec 18 14:16:46 host-0 kernel: RDX: 00000026947c130d RSI: 000000003d1879ab RDI: 0000000000000000
Dec 18 14:16:46 host-0 kernel: RBP: 0000000000000003 R08: 0000000000000004 R09: 0000000000022940
Dec 18 14:16:46 host-0 kernel: R10: 000000af27aa123c R11: ffff9b35a0e620a8 R12: ffff9b35a0e6dd18
Dec 18 14:16:46 host-0 kernel: R13: ffffffff9c125538 R14: 0000002690a80be3 R15: 0000000000000000
Dec 18 14:16:46 host-0 kernel:  ? cpuidle_enter_state+0x92/0x2a0
Dec 18 14:16:46 host-0 kernel:  do_idle+0x236/0x280
Dec 18 14:16:46 host-0 kernel:  cpu_startup_entry+0x6f/0x80
Dec 18 14:16:46 host-0 kernel:  start_secondary+0x1a7/0x200
Dec 18 14:16:46 host-0 kernel:  secondary_startup_64+0xb7/0xc0
Dec 18 14:16:46 host-0 kernel: br-ctlplane: hw csum failure
Dec 18 14:16:46 host-0 kernel: CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 4.18.0-147.0.3.el8_1.x86_64 #1
Dec 18 14:16:46 host-0 kernel: Hardware name: Dell Inc. PowerEdge R640/XXXXXX, BIOS 2.4.8 11/26/2019
Dec 18 14:16:46 host-0 kernel: Call Trace:
Dec 18 14:16:46 host-0 kernel:  <IRQ>
Dec 18 14:16:46 host-0 kernel:  dump_stack+0x5c/0x80
Dec 18 14:16:46 host-0 kernel:  __skb_checksum_complete+0xac/0xc0
Dec 18 14:16:46 host-0 kernel:  icmpv6_error+0x25e/0x2bc [nf_conntrack_ipv6]
Dec 18 14:16:46 host-0 kernel:  nf_conntrack_in+0xf5/0x520 [nf_conntrack]
Dec 18 14:16:46 host-0 kernel:  ? nft_do_chain_ipv6+0x7e/0xb0 [nf_tables]
Dec 18 14:16:46 host-0 kernel:  nf_hook_slow+0x44/0xc0
Dec 18 14:16:46 host-0 kernel:  ipv6_rcv+0x437/0x4e0
Dec 18 14:16:46 host-0 kernel:  ? ip6_make_skb+0x1d0/0x1d0
Dec 18 14:16:46 host-0 kernel:  __netif_receive_skb_core+0x57e/0xbd0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_poll_rx_cq+0xd5/0x920 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  ? ktime_get+0x36/0xa0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_napi_poll+0x263/0xce0 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  process_backlog+0xa6/0x160
Dec 18 14:16:46 host-0 kernel:  net_rx_action+0x149/0x3b0
Dec 18 14:16:46 host-0 kernel:  __do_softirq+0xe3/0x30a
Dec 18 14:16:46 host-0 kernel:  irq_exit+0x100/0x110
Dec 18 14:16:46 host-0 kernel:  do_IRQ+0x85/0xd0
Dec 18 14:16:46 host-0 kernel:  common_interrupt+0xf/0xf
Dec 18 14:16:46 host-0 kernel:  </IRQ>
Dec 18 14:16:46 host-0 kernel: RIP: 0010:cpuidle_enter_state+0xb7/0x2a0

The affected system is a Dell PowerEdge R640:

grep 'System Information' ./sos_commands/hardware/dmidecode -A 9
System Information
    Manufacturer: Dell Inc.
    Product Name: PowerEdge R640
    Version: Not Specified
    Serial Number: xxx
    UUID: xxx
    Wake-up Type: Power Switch
    SKU Number: SKU=NotProvided;ModelName=PowerEdge R640
    Family: PowerEdge

The network adapter is a Mellanox ConnectX-4 Lx:

grep b3:00.1 ./sos_commands/pci/lspci_-nnvv -A100
b3:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
    Subsystem: Mellanox Technologies Device [15b3:0016]
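
To gauge how often the message is logged and on which interfaces, the journal capture from the sosreport can be summarized. The following is a minimal sketch, not part of the original report: the function name count_csum_failures is hypothetical, and it assumes the log lines have the same "kernel: <interface>: hw csum failure" layout as the excerpt above.

```shell
# Hypothetical helper: count "hw csum failure" messages per interface.
# Argument: path to a journal dump, e.g. the sosreport file
#   ./sos_commands/logs/journalctl_--no-pager_--catalog_--boot
count_csum_failures() {
    # Keep only the failure lines, reduce each to the interface name
    # (the field between "kernel: " and ": hw csum failure"),
    # then count occurrences per interface, most frequent first.
    grep 'hw csum failure' "$1" \
        | sed -E 's/.*kernel: ([^ :]+): hw csum failure.*/\1/' \
        | sort | uniq -c | sort -rn
}

# Usage (path taken from the sosreport layout shown above):
#   count_csum_failures ./sos_commands/logs/journalctl_--no-pager_--catalog_--boot
```

Each output line holds a count followed by an interface name. In the excerpt above every failure is reported against br-ctlplane, so a single line would be expected for that excerpt.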

Environment

  • Red Hat Enterprise Linux 8.0 and 8.1
