Why do ring buffer changes fail to apply with "mlx5_buf_alloc() failed, -12" on mlx5-based NICs in RHEL 7?

Solution Unverified - Updated -

Issue

  • Why do ring buffer changes fail to apply with mlx5_buf_alloc() failed, -12 on mlx5-based NICs in RHEL 7?
  • When trying to increase the ring buffer size on Mellanox mlx5-based NICs, the following trace appears in the messages log and the NIC goes offline:
Feb 20 09:55:14 localhost kernel: ------------[ cut here ]------------
Feb 20 09:55:14 localhost kernel: WARNING: at mm/page_alloc.c:2901 __alloc_pages_slowpath+0x6f/0x725()
Feb 20 09:55:14 localhost kernel: Modules linked in: appex(POE) nfnetlink_log bluetooth rfkill nfnetlink_queue xt_pkttype xt_TPROXY ip_set_hash_ip xt_NFQUEUE ip6table_mangle xt_set ip_set nfnetlink xt_socket nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle xt_mark dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag xt_conntrack nf_conntrack tun bridge ebtable_filter ebtables ip6table_filter ip6_tables 8021q garp mrp stp llc team_mode_loadbalance team ext4 mbcache jbd2 intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core sg lpc_ich i2c_i801 hpwdt hpilo ipmi_ssif shpchp wmi ipmi_si ipmi_msghandler pcc_cpufreq acpi_power_meter nfsd auth_rpcgss
Feb 20 09:55:14 localhost kernel: nfs_acl lockd grace sunrpc binfmt_misc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm i2c_core crct10dif_pclmul crct10dif_common crc32c_intel serio_raw mlx5_core tg3 hpsa ptp pps_core scsi_transport_sas fjes dm_mirror dm_region_hash dm_log dm_mod [last unloaded: appex]
Feb 20 09:55:14 localhost kernel: CPU: 0 PID: 49956 Comm: ethtool Tainted: P           OE  ------------   3.10.0-514.16.1.el7.x86_64 #1
Feb 20 09:55:14 localhost kernel: Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 04/25/2017
Feb 20 09:55:14 localhost kernel: 0000000000000000 000000003aa91c69 ffff88150e673870 ffffffff816869c3
Feb 20 09:55:14 localhost kernel: ffff88150e6738a8 ffffffff81085cb0 00000000000000d0 00000000000080d0
Feb 20 09:55:14 localhost kernel: ffff88107ffda000 0000000000000000 00000000000280d0 ffff88150e6738b8
Feb 20 09:55:14 localhost kernel: Call Trace:
Feb 20 09:55:14 localhost kernel: [<ffffffff816869c3>] dump_stack+0x19/0x1b
Feb 20 09:55:14 localhost kernel: [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0
Feb 20 09:55:14 localhost kernel: [<ffffffff81085dfa>] warn_slowpath_null+0x1a/0x20
Feb 20 09:55:14 localhost kernel: [<ffffffff81681f0f>] __alloc_pages_slowpath+0x6f/0x725
Feb 20 09:55:14 localhost kernel: [<ffffffff8118b5c5>] __alloc_pages_nodemask+0x405/0x420
Feb 20 09:55:14 localhost kernel: [<ffffffff81030fcf>] dma_generic_alloc_coherent+0x8f/0x140
Feb 20 09:55:14 localhost kernel: [<ffffffff81062001>] x86_swiotlb_alloc_coherent+0x21/0x50
Feb 20 09:55:14 localhost kernel: [<ffffffffa0141bfd>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
Feb 20 09:55:14 localhost kernel: [<ffffffffa0141f7d>] mlx5_buf_alloc_node+0x4d/0xc0 [mlx5_core]
Feb 20 09:55:14 localhost kernel: [<ffffffffa014c5f1>] mlx5_cqwq_create+0x81/0x170 [mlx5_core]
Feb 20 09:55:14 localhost kernel: [<ffffffffa0150e96>] mlx5e_open_cq+0xa6/0x200 [mlx5_core]
Feb 20 09:55:14 localhost kernel: [<ffffffffa0152b4e>] mlx5e_open_locked+0x6ee/0xd80 [mlx5_core]
Feb 20 09:55:14 localhost kernel: [<ffffffff81189ffd>] ? __free_pages+0x1d/0x30
Feb 20 09:55:14 localhost kernel: [<ffffffff8118a212>] ? __free_memcg_kmem_pages+0x22/0x50
Feb 20 09:55:14 localhost kernel: [<ffffffffa01578fe>] mlx5e_set_ringparam+0x15e/0x280 [mlx5_core]
Feb 20 09:55:14 localhost kernel: [<ffffffff81578c43>] dev_ethtool+0xf73/0x1e70
Feb 20 09:55:14 localhost kernel: [<ffffffff81183775>] ? filemap_fault+0x215/0x410
Feb 20 09:55:14 localhost kernel: [<ffffffffa035c63c>] ? xfs_iunlock+0x11c/0x130 [xfs]
Feb 20 09:55:14 localhost kernel: [<ffffffff811f3756>] ? mem_cgroup_update_page_stat+0x16/0x50
Feb 20 09:55:14 localhost kernel: [<ffffffff811bb4cc>] ? page_add_file_rmap+0x8c/0xc0
Feb 20 09:55:14 localhost kernel: [<ffffffff8156eb89>] ? dev_get_by_name_rcu+0x69/0x90
Feb 20 09:55:14 localhost kernel: [<ffffffff8158947f>] dev_ioctl+0x1cf/0x590
Feb 20 09:55:14 localhost kernel: [<ffffffff815529c5>] sock_do_ioctl+0x45/0x50
Feb 20 09:55:14 localhost kernel: [<ffffffff815530d0>] sock_ioctl+0x1f0/0x2c0
Feb 20 09:55:14 localhost kernel: [<ffffffff81212555>] do_vfs_ioctl+0x2d5/0x4b0
Feb 20 09:55:14 localhost kernel: [<ffffffff81692561>] ? __do_page_fault+0x171/0x450
Feb 20 09:55:14 localhost kernel: [<ffffffff812127d1>] SyS_ioctl+0xa1/0xc0
Feb 20 09:55:14 localhost kernel: [<ffffffff81697089>] system_call_fastpath+0x16/0x1b
Feb 20 09:55:14 localhost kernel: ---[ end trace d697aa0c228e1734 ]---
Feb 20 09:55:14 localhost kernel: mlx5_core 0000:05:00.0: 0000:05:00.0:mlx5_cqwq_create:121:(pid 49956): mlx5_buf_alloc() failed, -12
Feb 20 09:55:14 localhost kernel: mlx5_core 0000:05:00.0 ens2f0: mlx5e_open_locked: mlx5e_open_channels failed, -12
Feb 20 09:55:14 localhost teamd: ens2f0: ethtool-link went down.
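
The -12 at the end of the two mlx5_core messages is a negated Linux errno value. The following is a minimal sketch of how such a code can be decoded; the helper name decode_kernel_errno is hypothetical and is not part of any Red Hat or Mellanox tool. On Linux, errno 12 is ENOMEM, which is consistent with the page allocator warning (__alloc_pages_slowpath) at the top of the call trace.

```python
import errno
import os
import re


def decode_kernel_errno(log_line):
    """Decode a trailing 'failed, -N' errno from a kernel log line.

    Returns (code, symbolic_name, description), or None if the line
    does not end with a 'failed, -N' pattern.
    """
    match = re.search(r"failed, -(\d+)\s*$", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numeric codes to names such as 'ENOMEM'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's human-readable description.
    return code, name, os.strerror(code)


# Log line copied from the trace above (syslog prefix removed).
line = (
    "mlx5_core 0000:05:00.0: 0000:05:00.0:mlx5_cqwq_create:121:"
    "(pid 49956): mlx5_buf_alloc() failed, -12"
)

if __name__ == "__main__":
    print(decode_kernel_errno(line))
```

The same helper applies to the mlx5e_open_channels failed, -12 line, since both messages report the same error propagated up the call chain.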

Environment

  • Red Hat Enterprise Linux (RHEL) 7.3.
  • Mellanox Technologies MT27710 Family [ConnectX-4 Lx] NICs.
  • mlx5_core driver
