Solarflare sfc NIC reports a transmit timeout and "TX stuck with port_enabled=1: resetting channels" in RHEL 7

Solution Verified

Issue

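  • A Solarflare NIC goes offline. The kernel logs the generic NETDEV WATCHDOG transmit-timeout backtrace, followed by the sfc driver's "TX stuck with port_enabled=1: resetting channels" error. The subsequent reset fails and the interface is disabled: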

    Mar  3 12:58:14 localhost kernel: ------------[ cut here ]------------
    Mar  3 12:58:14 localhost kernel: WARNING: at net/sched/sch_generic.c:297 dev_watchdog+0x270/0x280()
    Mar  3 12:58:14 localhost kernel: NETDEV WATCHDOG: eno1 (sfc): transmit queue 1 timed out
    Mar  3 12:58:14 localhost kernel: Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables tcp_lp bnep bluetooth rfkill btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc tun fuse team_mode_loadbalance team_mode_activebackup 8021q garp mrp team bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl kvm_intel kvm pcspkr vfat fat sb_edac
    Mar  3 12:58:14 localhost kernel: hpilo lpc_ich edac_core hpwdt i2c_i801 mfd_core ahci libahci libata ioatdma ipmi_devintf sg shpchp ipmi_si ipmi_msghandler pcc_cpufreq acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace xfs libcrc32c dm_service_time sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel syscopyarea aesni_intel sysfillrect sysimgblt lrw gf128mul drm_kms_helper glue_helper ablk_helper cryptd ttm sfc drm igb mtd dca mpt3sas ptp pps_core mdio i2c_algo_bit raid_class i2c_core scsi_transport_sas wmi sunrpc dm_mirror dm_region_hash dm_log scsi_transport_iscsi dm_multipath dm_mod [last unloaded: ip_tables]
    Mar  3 12:58:14 localhost kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: P           OE  ------------   3.10.0-327.el7.x86_64 #1
    Mar  3 12:58:14 localhost kernel: Hardware name: HP ProLiant DL180 Gen9, BIOS U20 07/20/2015
    Mar  3 12:58:14 localhost kernel: ffff88087fc03d88 ae563a0f2b70f318 ffff88087fc03d40 ffffffff816351f1
    Mar  3 12:58:14 localhost kernel: ffff88087fc03d78 ffffffff8107b200 0000000000000001 ffff880869a5a000
    Mar  3 12:58:14 localhost kernel: ffff880861894f40 0000000000000040 0000000000000006 ffff88087fc03de0
    Mar  3 12:58:14 localhost kernel: Call Trace:
    Mar  3 12:58:14 localhost kernel: <IRQ>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
    Mar  3 12:58:14 localhost kernel: [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
    Mar  3 12:58:14 localhost kernel: [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
    Mar  3 12:58:14 localhost kernel: [<ffffffff8154ca90>] dev_watchdog+0x270/0x280
    Mar  3 12:58:14 localhost kernel: [<ffffffff8154c820>] ? dev_graft_qdisc+0x80/0x80
    Mar  3 12:58:14 localhost kernel: [<ffffffff8108b0a6>] call_timer_fn+0x36/0x110
    Mar  3 12:58:14 localhost kernel: [<ffffffff8154c820>] ? dev_graft_qdisc+0x80/0x80
    Mar  3 12:58:14 localhost kernel: [<ffffffff8108dd97>] run_timer_softirq+0x237/0x340
    Mar  3 12:58:14 localhost kernel: [<ffffffff81084b0f>] __do_softirq+0xef/0x280
    Mar  3 12:58:14 localhost kernel: [<ffffffff8164721c>] call_softirq+0x1c/0x30
    Mar  3 12:58:14 localhost kernel: [<ffffffff81016fc5>] do_softirq+0x65/0xa0
    Mar  3 12:58:14 localhost kernel: [<ffffffff81084ea5>] irq_exit+0x115/0x120
    Mar  3 12:58:14 localhost kernel: [<ffffffff81647e95>] smp_apic_timer_interrupt+0x45/0x60
    Mar  3 12:58:14 localhost kernel: [<ffffffff8164655d>] apic_timer_interrupt+0x6d/0x80
    Mar  3 12:58:14 localhost kernel: <EOI>  [<ffffffff814d4552>] ? cpuidle_enter_state+0x52/0xc0
    Mar  3 12:58:14 localhost kernel: [<ffffffff814d4699>] cpuidle_idle_call+0xd9/0x210
    Mar  3 12:58:14 localhost kernel: [<ffffffff8101e4be>] arch_cpu_idle+0xe/0x30
    Mar  3 12:58:14 localhost kernel: [<ffffffff810d6305>] cpu_startup_entry+0x245/0x290
    Mar  3 12:58:14 localhost kernel: [<ffffffff810475fa>] start_secondary+0x1ba/0x230
    Mar  3 12:58:14 localhost kernel: ---[ end trace 5ea2b0fcc2a3a3d9 ]---
    Mar  3 12:58:14 localhost kernel: sfc 0000:81:00.0 eno1: TX stuck with port_enabled=1: resetting channels
    Mar  3 12:58:14 localhost kernel: sfc 0000:81:00.0 eno1: resetting (RECOVER_OR_ALL)
    Mar  3 12:58:14 localhost kernel: sfc 0000:81:00.0 eno1: MC command 0xa2 inlen 132 failed rc=-1 (raw=1) arg=0
    Mar  3 12:58:14 localhost kernel: sfc 0000:81:00.0 eno1: efx_ef10_rx_push_exclusive_rss_config: failed rc=-1
    Mar  3 12:58:14 localhost kernel: sfc 0000:81:00.0 eno1: MC command 0x80 inlen 100 failed rc=-22 (raw=22) arg=21
    Mar  3 12:58:14 localhost kernel: sfc 0000:81:00.0 eno1: has been disabled
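
To check whether a system has hit the same sequence, the three key messages from the excerpt above can be searched for in the kernel log. The following is a minimal sketch, not a supported tool: the function name `scan_sfc_tx_timeouts` and the event labels are hypothetical, and the patterns are copied from this one log excerpt, so the wording may differ on other driver versions.

```python
import re

# Patterns taken from the log excerpt in this article (assumption: other
# sfc driver versions may word these messages differently).
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)
TX_STUCK_RE = re.compile(
    r"sfc \S+ (?P<iface>\S+): TX stuck with port_enabled=\d+: resetting channels"
)
DISABLED_RE = re.compile(r"sfc \S+ (?P<iface>\S+): has been disabled")


def scan_sfc_tx_timeouts(lines):
    """Map each interface name to the ordered list of events seen for it.

    Event labels (hypothetical names used only in this sketch):
    "watchdog_timeout", "tx_stuck", "disabled".
    """
    events = {}
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            # The watchdog message is generic; keep only sfc interfaces.
            if match.group("driver") == "sfc":
                events.setdefault(match.group("iface"), []).append("watchdog_timeout")
            continue
        match = TX_STUCK_RE.search(line)
        if match:
            events.setdefault(match.group("iface"), []).append("tx_stuck")
            continue
        match = DISABLED_RE.search(line)
        if match:
            events.setdefault(match.group("iface"), []).append("disabled")
    return events
```

The function accepts any iterable of log lines, for example an open file handle on the system log (commonly `/var/log/messages` on RHEL 7, though the location depends on the logging configuration). An interface whose list ends in `"disabled"` matches the failure described in this article.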
    

Environment

  • Red Hat Enterprise Linux 7
  • sfc module
