Accessing the watchdog device /dev/watchdog0 via the kfod.bin utility triggers a system reset

Solution Verified - Updated -

Issue

  • HPE ProLiant systems with Oracle Grid Infrastructure and Oracle AHF installed are experiencing unexpected restarts with the following messages:
[105351.966765] watchdog: watchdog0: watchdog did not stop!
[105351.966841] watchdog: watchdog0: watchdog did not stop!
[105373.209527] Kernel panic - not syncing: 06: An NMI occurred. Depending on your system the reason for the NMI is logged in any one of the following resources:
[105373.209531] 1. Integrated Management Log (IML)
[105373.209532] 2. OA Syslog
[105373.209533] 3. OA Forward Progress Log
[105373.209533] 4. iLO Event Log
[105373.209534] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 4.18.0-553.22.1.el8_10.x86_64 #1
[105373.209535] Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 08/29/2024
[105373.209536] Call Trace:
[105373.209537]  <NMI>
[105373.209538]  dump_stack+0x41/0x60
[105373.209538]  panic+0xe7/0x2ac
[105373.209539]  ? __ghes_peek_estatus.isra.13+0x49/0xa0
[105373.209539]  nmi_panic.cold.11+0xc/0xc
[105373.209539]  hpwdt_pretimeout+0x7f/0xc6 [hpwdt]
[105373.209540]  nmi_handle+0x63/0x110
[105373.209540]  io_check_error+0x16/0x30
[105373.209540]  default_do_nmi+0x97/0x110
[105373.209541]  do_nmi+0x19c/0x210
[105373.209541]  end_repeat_nmi+0x16/0x69
[105373.209541] RIP: 0010:intel_idle+0xa4/0x120
[105373.209542] Code: 40 dc 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 75 19 0f 1f 44 00 00 0f 00 2d c5 e4 67 00 c1 eb 18 b9 01 00 00 00 89 d8 0f 01 c9 <40> 84 f6 74 18 48 89 fa b9 48 00 00 00 89 f8 48 c1 ea 20 0f 30 65
[105373.209543] RSP: 0018:ffffffff88003e20 EFLAGS: 00000002
[105373.209544] RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
[105373.209544] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffe0057f401160
[105373.209545] RBP: 0000000000000004 R08: 0000000000000002 R09: 0000000000033000
[105373.209545] R10: 0012020d2c86f8a8 R11: ffff9be3ff432484 R12: 0000000000000004
[105373.209545] R13: ffffffff884c69a0 R14: 0000000000000004 R15: 0000000000000004
  • Dell systems with Oracle Grid Infrastructure and Oracle AHF installed are experiencing abrupt restarts with the following messages:
[105351.966765] watchdog: watchdog0: watchdog did not stop!
[105351.966841] watchdog: watchdog0: watchdog did not stop!

Environment

  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 9
  • HP ProLiant system with the [hpwdt] module loaded
  • Dell PowerEdge systems with the [iTCO_wdt] module loaded
  • Other systems with the [iTCO_wdt] module loaded
  • Oracle Autonomous Health Framework (AHF)
  • Kernel Filesystem Oracle Disk (kfod.bin) utility

Subscriber exclusive content

A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.

Current Customers and Partners

Log in for full access

Log In

New to Red Hat?

Learn more about Red Hat subscriptions

Using a Red Hat product through a public cloud?

How to access this content