The NVIDIA BlueField-2 DPU SoC requires an errata kernel for kdump to pass

Solution In Progress - Updated -

Issue

In a kdump environment, the mlx5 driver uses as few resources as possible because RAM is very limited. On a device that supports TLS offload (such as ConnectX-6 Dx or BlueField-2), the stop room ends up larger than the send queue (SQ) size, so the driver cannot open its send queues. This leaves the kdump kernel without a working network interface, and copying the vmcore over NFS fails.

[   18.477401] mlx5_core 0000:03:00.1 enp3s0f1: Stop room 95 is bigger than the SQ size 64

[   17.990167] WARNING: CPU: 0 PID: 479 at drivers/net/ethernet/mellanox/mlx5/core/en_main.c:1131 mlx5e_open_sqs+0x49c/0x508 [mlx5_core]
[   17.727583] Call trace:                                 
[   17.732526]  mlx5e_open_sqs+0x49c/0x508 [mlx5_core]     
[   17.742360]  mlx5e_open_channels+0x784/0x938 [mlx5_core]
[   17.753066]  mlx5e_open_locked+0x44/0xb8 [mlx5_core]    
[   17.763075]  mlx5e_open+0x38/0x88 [mlx5_core]           
[   17.771827]  __dev_open+0xf8/0x190                      
[   17.778648]  __dev_change_flags+0x1a0/0x208             
[   17.787038]  dev_change_flags+0x3c/0x78                 
[   17.794730]  do_setlink+0x2a0/0xc88                     
[   17.801724]  __rtnl_newlink+0x5e4/0x700                 
[   17.809411]  rtnl_newlink+0x58/0x80                     
[   17.816401]  rtnetlink_rcv_msg+0x230/0x2f8              
[   17.824615]  netlink_rcv_skb+0x60/0x120                 
[   17.832305]  rtnetlink_rcv+0x28/0x38                    
[   17.839472]  netlink_unicast+0x1d0/0x260                
[   17.847334]  netlink_sendmsg+0x1b4/0x358                
[   17.855201]  sock_sendmsg+0x4c/0x68                     
[   17.862193]  ____sys_sendmsg+0x200/0x240                
[   17.870058]  ___sys_sendmsg+0x90/0xd0                   
[   17.877401]  __sys_sendmsg+0x68/0xb0                    
[   17.884567]  __arm64_sys_sendmsg+0x2c/0x38              
[   17.892785]  el0_svc_handler+0xb0/0x180                 
[   17.900478]  el0_svc+0x8/0xc                            

[   17.441498] mount[495]: mount.nfs: Network is unreachable
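
To check a captured kdump console log for this condition, the driver message shown above can be matched with a regular expression. The following is a minimal sketch, not a Red Hat or NVIDIA tool: the function name find_stop_room_failures and the dictionary keys are hypothetical, and the pattern assumes the message keeps the exact wording quoted in this article.

```python
import re

# Pattern built from the message quoted in this article, e.g.
# "mlx5_core 0000:03:00.1 enp3s0f1: Stop room 95 is bigger than the SQ size 64"
# Assumption: the driver message wording is unchanged on the kernel in use.
STOP_ROOM_RE = re.compile(
    r"mlx5_core\s+(?P<pci>\S+)\s+(?P<ifname>\S+):\s+"
    r"Stop room (?P<stop_room>\d+) is bigger than the SQ size (?P<sq_size>\d+)"
)


def find_stop_room_failures(log_text):
    """Return one dict per log line that reports stop room > SQ size.

    Hypothetical helper for illustration only.
    """
    failures = []
    for line in log_text.splitlines():
        match = STOP_ROOM_RE.search(line)
        if match:
            failures.append({
                "pci": match.group("pci"),
                "ifname": match.group("ifname"),
                "stop_room": int(match.group("stop_room")),
                "sq_size": int(match.group("sq_size")),
            })
    return failures


if __name__ == "__main__":
    # Sample line copied from the Issue section of this article.
    sample = (
        "[   18.477401] mlx5_core 0000:03:00.1 enp3s0f1: "
        "Stop room 95 is bigger than the SQ size 64"
    )
    for failure in find_stop_room_failures(sample):
        print(failure)
```

Any entry returned by the helper identifies an interface whose send queues could not be opened, which is consistent with the later "mount.nfs: Network is unreachable" error.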

Environment

  • Red Hat Enterprise Linux 8.4
  • NVIDIA BlueField-2 DPU SoC
  • aarch64 platform
