RHEL8.0 and 8.1 - hw csum failure seen in dmesg and console (using mlx5/Mellanox)

Solution In Progress

Issue

On RHEL 8.0 and 8.1 hosts that use a Mellanox network adapter (mlx5 driver), the kernel logs "hw csum failure" messages, each followed by a call trace, to dmesg and the console.

The journal collected in the sosreport shows the message and its call trace repeating:

cat ./sos_commands/logs/journalctl_--no-pager_--catalog_--boot
(...)
-- The start-up result is RESULT.
Dec 18 14:16:46 host-0 kernel: br-ctlplane: hw csum failure
Dec 18 14:16:46 host-0 kernel: CPU: 23 PID: 0 Comm: swapper/23 Kdump: loaded Not tainted 4.18.0-147.0.3.el8_1.x86_64 #1
Dec 18 14:16:46 host-0 kernel: Hardware name: Dell Inc. PowerEdge R640/XXXXXX, BIOS 2.4.8 11/26/2019
Dec 18 14:16:46 host-0 kernel: Call Trace:
Dec 18 14:16:46 host-0 kernel:  <IRQ>
Dec 18 14:16:46 host-0 kernel:  dump_stack+0x5c/0x80
Dec 18 14:16:46 host-0 kernel:  __skb_checksum_complete+0xac/0xc0
Dec 18 14:16:46 host-0 kernel:  icmpv6_error+0x25e/0x2bc [nf_conntrack_ipv6]
Dec 18 14:16:46 host-0 kernel:  nf_conntrack_in+0xf5/0x520 [nf_conntrack]
Dec 18 14:16:46 host-0 kernel:  ? nft_do_chain_ipv6+0x7e/0xb0 [nf_tables]
Dec 18 14:16:46 host-0 kernel:  nf_hook_slow+0x44/0xc0
Dec 18 14:16:46 host-0 kernel:  ipv6_rcv+0x437/0x4e0
Dec 18 14:16:46 host-0 kernel:  ? ip6_make_skb+0x1d0/0x1d0
Dec 18 14:16:46 host-0 kernel:  __netif_receive_skb_core+0x57e/0xbd0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_poll_rx_cq+0xd5/0x920 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  ? ktime_get+0x36/0xa0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_napi_poll+0x263/0xce0 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  process_backlog+0xa6/0x160
Dec 18 14:16:46 host-0 kernel:  net_rx_action+0x149/0x3b0
Dec 18 14:16:46 host-0 kernel:  __do_softirq+0xe3/0x30a
Dec 18 14:16:46 host-0 kernel:  irq_exit+0x100/0x110
Dec 18 14:16:46 host-0 kernel:  do_IRQ+0x85/0xd0
Dec 18 14:16:46 host-0 kernel:  common_interrupt+0xf/0xf
Dec 18 14:16:46 host-0 kernel:  </IRQ>
Dec 18 14:16:46 host-0 kernel: RIP: 0010:cpuidle_enter_state+0xb7/0x2a0
Dec 18 14:16:46 host-0 kernel: Code: e8 ae 6e a5 ff 80 7c 24 03 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d7 01 00 00 31 ff e8 90 5d ab ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 4c 29 f3 ba ff ff ff 7f 48 39 c3 7f
Dec 18 14:16:46 host-0 kernel: RSP: 0018:ffffb59886773e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd5
Dec 18 14:16:46 host-0 kernel: RAX: ffff9b35a0de3100 RBX: 00000026689eab4b RCX: 000000000000001f
Dec 18 14:16:46 host-0 kernel: RDX: 00000026689eab4b RSI: 000000003d1879ab RDI: 0000000000000000
Dec 18 14:16:46 host-0 kernel: RBP: 0000000000000003 R08: 0000000000000004 R09: 0000000000022940
Dec 18 14:16:46 host-0 kernel: R10: 000000aecbc39a3a R11: ffff9b35a0de20a8 R12: ffff9b35a0dedd18
Dec 18 14:16:46 host-0 kernel: R13: ffffffff9c125538 R14: 00000026680cfe2a R15: 0000000000000000
Dec 18 14:16:46 host-0 kernel:  ? cpuidle_enter_state+0x92/0x2a0
Dec 18 14:16:46 host-0 kernel:  do_idle+0x236/0x280
Dec 18 14:16:46 host-0 kernel:  cpu_startup_entry+0x6f/0x80
Dec 18 14:16:46 host-0 kernel:  start_secondary+0x1a7/0x200
Dec 18 14:16:46 host-0 kernel:  secondary_startup_64+0xb7/0xc0
Dec 18 14:16:46 host-0 kernel: br-ctlplane: hw csum failure
Dec 18 14:16:46 host-0 kernel: CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 4.18.0-147.0.3.el8_1.x86_64 #1
Dec 18 14:16:46 host-0 kernel: Hardware name: Dell Inc. PowerEdge R640/XXXXXXXX, BIOS 2.4.8 11/26/2019
Dec 18 14:16:46 host-0 kernel: Call Trace:
Dec 18 14:16:46 host-0 kernel:  <IRQ>
Dec 18 14:16:46 host-0 kernel:  dump_stack+0x5c/0x80
Dec 18 14:16:46 host-0 kernel:  __skb_checksum_complete+0xac/0xc0
Dec 18 14:16:46 host-0 kernel:  icmpv6_error+0x25e/0x2bc [nf_conntrack_ipv6]
Dec 18 14:16:46 host-0 kernel:  nf_conntrack_in+0xf5/0x520 [nf_conntrack]
Dec 18 14:16:46 host-0 kernel:  ? nft_do_chain_ipv6+0x7e/0xb0 [nf_tables]
Dec 18 14:16:46 host-0 kernel:  nf_hook_slow+0x44/0xc0
Dec 18 14:16:46 host-0 kernel:  ipv6_rcv+0x437/0x4e0
Dec 18 14:16:46 host-0 kernel:  ? ip6_make_skb+0x1d0/0x1d0
Dec 18 14:16:46 host-0 kernel:  __netif_receive_skb_core+0x57e/0xbd0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_poll_rx_cq+0xd5/0x920 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  ? ktime_get+0x36/0xa0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_napi_poll+0x263/0xce0 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  process_backlog+0xa6/0x160
Dec 18 14:16:46 host-0 kernel:  net_rx_action+0x149/0x3b0
Dec 18 14:16:46 host-0 kernel:  __do_softirq+0xe3/0x30a
Dec 18 14:16:46 host-0 kernel:  irq_exit+0x100/0x110
Dec 18 14:16:46 host-0 kernel:  do_IRQ+0x85/0xd0
Dec 18 14:16:46 host-0 kernel:  common_interrupt+0xf/0xf
Dec 18 14:16:46 host-0 kernel:  </IRQ>
Dec 18 14:16:46 host-0 kernel: RIP: 0010:cpuidle_enter_state+0xb7/0x2a0
Dec 18 14:16:46 host-0 kernel: Code: e8 ae 6e a5 ff 80 7c 24 03 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d7 01 00 00 31 ff e8 90 5d ab ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 4c 29 f3 ba ff ff ff 7f 48 39 c3 7f
Dec 18 14:16:46 host-0 kernel: RSP: 0018:ffffb59886783e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd8
Dec 18 14:16:46 host-0 kernel: RAX: ffff9b35a0e63100 RBX: 00000026947c130d RCX: 000000000000001f
Dec 18 14:16:46 host-0 kernel: RDX: 00000026947c130d RSI: 000000003d1879ab RDI: 0000000000000000
Dec 18 14:16:46 host-0 kernel: RBP: 0000000000000003 R08: 0000000000000004 R09: 0000000000022940
Dec 18 14:16:46 host-0 kernel: R10: 000000af27aa123c R11: ffff9b35a0e620a8 R12: ffff9b35a0e6dd18
Dec 18 14:16:46 host-0 kernel: R13: ffffffff9c125538 R14: 0000002690a80be3 R15: 0000000000000000
Dec 18 14:16:46 host-0 kernel:  ? cpuidle_enter_state+0x92/0x2a0
Dec 18 14:16:46 host-0 kernel:  do_idle+0x236/0x280
Dec 18 14:16:46 host-0 kernel:  cpu_startup_entry+0x6f/0x80
Dec 18 14:16:46 host-0 kernel:  start_secondary+0x1a7/0x200
Dec 18 14:16:46 host-0 kernel:  secondary_startup_64+0xb7/0xc0
Dec 18 14:16:46 host-0 kernel: br-ctlplane: hw csum failure
Dec 18 14:16:46 host-0 kernel: CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 4.18.0-147.0.3.el8_1.x86_64 #1
Dec 18 14:16:46 host-0 kernel: Hardware name: Dell Inc. PowerEdge R640/XXXXXX, BIOS 2.4.8 11/26/2019
Dec 18 14:16:46 host-0 kernel: Call Trace:
Dec 18 14:16:46 host-0 kernel:  <IRQ>
Dec 18 14:16:46 host-0 kernel:  dump_stack+0x5c/0x80
Dec 18 14:16:46 host-0 kernel:  __skb_checksum_complete+0xac/0xc0
Dec 18 14:16:46 host-0 kernel:  icmpv6_error+0x25e/0x2bc [nf_conntrack_ipv6]
Dec 18 14:16:46 host-0 kernel:  nf_conntrack_in+0xf5/0x520 [nf_conntrack]
Dec 18 14:16:46 host-0 kernel:  ? nft_do_chain_ipv6+0x7e/0xb0 [nf_tables]
Dec 18 14:16:46 host-0 kernel:  nf_hook_slow+0x44/0xc0
Dec 18 14:16:46 host-0 kernel:  ipv6_rcv+0x437/0x4e0
Dec 18 14:16:46 host-0 kernel:  ? ip6_make_skb+0x1d0/0x1d0
Dec 18 14:16:46 host-0 kernel:  __netif_receive_skb_core+0x57e/0xbd0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_poll_rx_cq+0xd5/0x920 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  ? ktime_get+0x36/0xa0
Dec 18 14:16:46 host-0 kernel:  ? mlx5e_napi_poll+0x263/0xce0 [mlx5_core]
Dec 18 14:16:46 host-0 kernel:  process_backlog+0xa6/0x160
Dec 18 14:16:46 host-0 kernel:  net_rx_action+0x149/0x3b0
Dec 18 14:16:46 host-0 kernel:  __do_softirq+0xe3/0x30a
Dec 18 14:16:46 host-0 kernel:  irq_exit+0x100/0x110
Dec 18 14:16:46 host-0 kernel:  do_IRQ+0x85/0xd0
Dec 18 14:16:46 host-0 kernel:  common_interrupt+0xf/0xf
Dec 18 14:16:46 host-0 kernel:  </IRQ>
Dec 18 14:16:46 host-0 kernel: RIP: 0010:cpuidle_enter_state+0xb7/0x2a0
(...)

grep 'System Information' ./sos_commands/hardware/dmidecode -A 9
System Information
    Manufacturer: Dell Inc.
    Product Name: PowerEdge R640
    Version: Not Specified
    Serial Number: xxx
    UUID: xxx
    Wake-up Type: Power Switch
    SKU Number: SKU=NotProvided;ModelName=PowerEdge R640
    Family: PowerEdge

grep b3:00.1 ./sos_commands/pci/lspci_-nnvv -A100
b3:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
    Subsystem: Mellanox Technologies Device [15b3:0016]
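
Because the same trace repeats many times, it can help to count the failures per interface rather than read the journal by eye. The following is a minimal sketch, not part of the original solution: the function name csum_failures_by_iface and the JOURNAL variable are hypothetical, and the default path simply reuses the sosreport file shown above.

```shell
#!/bin/sh
# Count "hw csum failure" messages per interface in a saved journal file.
# Hypothetical helper; prints one "<interface> <count>" line per interface.
csum_failures_by_iface() {
    # Matching lines look like:
    #   Dec 18 14:16:46 host-0 kernel: br-ctlplane: hw csum failure
    # Extract the interface name, then count identical names.
    sed -n 's/.*kernel: \([^ ]*\): hw csum failure.*/\1/p' "$1" \
        | sort | uniq -c | awk '{print $2, $1}'
}

# Assumed default: the journal file from the sosreport layout shown above.
JOURNAL=${JOURNAL:-./sos_commands/logs/journalctl_--no-pager_--catalog_--boot}
if [ -r "$JOURNAL" ]; then
    csum_failures_by_iface "$JOURNAL"
fi
```

Run it from the top of the extracted sosreport, or set JOURNAL to another saved log file. On a live system, the output of journalctl -k can be saved to a file and passed in the same way.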

Environment

  • Red Hat Enterprise Linux 8.0 and 8.1
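
To check whether a host falls in this environment, the kernel release string can be compared with the one in the trace above (4.18.0-147.0.3.el8_1.x86_64). The sketch below is illustrative only: the function name rhel8_minor_from_kernel is hypothetical, and it assumes that RHEL 8.0 kernels use build number 80 and RHEL 8.1 kernels use build number 147.

```shell
#!/bin/sh
# Map a kernel release string to the RHEL 8 minor release it belongs to.
# Assumption: 4.18.0-80.* is RHEL 8.0 and 4.18.0-147.* is RHEL 8.1.
rhel8_minor_from_kernel() {
    case "$1" in
        4.18.0-80.*)  echo "8.0" ;;
        4.18.0-147.*) echo "8.1" ;;
        *)            echo "other" ;;
    esac
}

# Classify the running kernel; pass a string from a sosreport instead
# to classify a collected system.
rhel8_minor_from_kernel "$(uname -r)"
```

A result of "other" only means the kernel is outside the two releases named here; it says nothing about whether that kernel is affected.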
