[RHEL 4.6] The system does not halt after diskdump completes
Environment
- Red Hat Enterprise Linux 4 update 6
- kernel versions: 2.6.9-67.EL and 2.6.9-89.0.26.EL
- diskdumputils-1.4.1-2
Issue
If the watchdog timer is set by the ipmi_watchdog module, the system does not halt after diskdump completes, even though kernel.panic=0 is specified in /etc/sysctl.conf to make it halt. Instead, the system reboots 255 seconds later.
Resolution
diskdump is working as designed (i.e. NOTABUG). ipmi_watchdog is also working as designed. Therefore, this issue does not require a fix.
Root Cause
In diskdump.c, start_disk_dump() calls notifier_call_chain(&panic_notifier_list) after it has finished dumping but before it reaches its halt loop:
static void start_disk_dump(struct pt_regs *regs)
{
    ... [ snip ] ...
    platform_start_crashdump(diskdump_stack, disk_dump, regs);
    ... [ snip ] ...
    notifier_call_chain(&panic_notifier_list, 0, NULL);
    ... [ snip ] ...
    for (;;) {
        touch_nmi_watchdog();
        machine_halt();
        diskdump_mdelay(1000);
    }
}
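The ordering above can be modelled in user space. The following is a minimal sketch, not the kernel implementation: the struct layout and the NOTIFY_* values follow the kernel's notifier interface, but every function prefixed with model_, the model_wdog_nb block, and the trace array are hypothetical names introduced for illustration. It shows that any callback registered on the panic list runs after the dump and before the halt loop.

```c
#include <stddef.h>

#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier_block {
    int (*notifier_call)(struct notifier_block *self,
                         unsigned long event, void *data);
    struct notifier_block *next;
    int priority;
};

/* Trace of the steps taken, so the ordering can be inspected afterwards. */
static const char *trace[8];
static int trace_len;

static void record(const char *step)
{
    if (trace_len < 8)
        trace[trace_len++] = step;
}

/* Insert n into the list, keeping higher priorities first. */
static void model_chain_register(struct notifier_block **list,
                                 struct notifier_block *n)
{
    while (*list && n->priority <= (*list)->priority)
        list = &(*list)->next;
    n->next = *list;
    *list = n;
}

/* Call every registered callback in order, stopping early only on request. */
static int model_call_chain(struct notifier_block **list,
                            unsigned long event, void *data)
{
    int ret = NOTIFY_DONE;
    struct notifier_block *nb;

    for (nb = *list; nb; nb = nb->next) {
        ret = nb->notifier_call(nb, event, data);
        if (ret & NOTIFY_STOP_MASK)
            break;
    }
    return ret;
}

/* Hypothetical stand-in for a watchdog driver's panic callback. */
static int model_wdog_panic_handler(struct notifier_block *self,
                                    unsigned long event, void *data)
{
    (void)self;
    (void)event;
    (void)data;
    record("watchdog panic handler");
    return NOTIFY_OK;
}

static struct notifier_block model_wdog_nb = {
    model_wdog_panic_handler, NULL, 0
};

/* Same ordering as start_disk_dump(): dump, notify, then halt. */
static void model_start_disk_dump(struct notifier_block **panic_list)
{
    record("dump");
    model_call_chain(panic_list, 0, NULL);
    record("halt loop"); /* the real code loops forever here */
}
```

Registering model_wdog_nb on an empty list and then calling model_start_disk_dump() leaves the trace as "dump", "watchdog panic handler", "halt loop". This is why a watchdog callback still gets a chance to re-arm its timer even though the system is about to halt.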
In ipmi_watchdog.c, wdog_panic_handler() checks that watchdog_user has been set, which is done when the watchdog is registered, and that the panic has not been handled already. It then proceeds to reset the timeout to 255 seconds, after which the system reboots:
static int wdog_panic_handler(struct notifier_block *this,
                              unsigned long         event,
                              void                  *unused)
{
    static int panic_event_handled = 0;

    /* On a panic, if we have a panic timeout, make sure that the thing
       reboots, even if it hangs during that panic. */
    if (watchdog_user && !panic_event_handled) {
        /* Make sure the panic doesn't hang, and make sure we
           do this only once. */
        panic_event_handled = 1;

        timeout = 255;
        pretimeout = 0;
        ipmi_watchdog_state = WDOG_TIMEOUT_RESET;
        panic_halt_ipmi_set_timeout();
    }

    return NOTIFY_OK;
}
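The guard logic can be exercised outside the kernel. The sketch below is a simplified model, assuming a plain struct in place of the driver's global variables; struct wdog_model, model_panic_handler() and the timer_rearmed counter are hypothetical, and the counter merely stands in for the call to panic_halt_ipmi_set_timeout().

```c
#include <stddef.h>

/* Hypothetical model state mirroring the driver variables in the excerpt. */
struct wdog_model {
    const void *watchdog_user;  /* non-NULL once the watchdog is registered */
    int panic_event_handled;    /* set after the first panic notification */
    int timeout;                /* seconds */
    int pretimeout;             /* seconds */
    int timer_rearmed;          /* times the timer would be reprogrammed */
};

/* Same condition and assignments as the excerpt, applied to the model. */
static void model_panic_handler(struct wdog_model *w)
{
    if (w->watchdog_user && !w->panic_event_handled) {
        w->panic_event_handled = 1;
        w->timeout = 255;
        w->pretimeout = 0;
        w->timer_rearmed++;
    }
}
```

With watchdog_user set and timeout initialised to 1800 (the value used in the reproduction steps), the first call changes timeout to 255, and a second call changes nothing. With watchdog_user left NULL, the model handler does nothing and the configured timeout is kept.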
Diagnostic Steps
Steps to reproduce:
- Set up diskdump.
- Set kernel.panic to 0 in /etc/sysctl.conf.
- Execute set.sh in the uncompressed "panic" directory to build a kernel module, panic.ko, which causes a panic.
- Reboot the system.
- Execute the following commands:
  # /sbin/modprobe ipmi_si type=kcs ports=0xd80 trydefaults=0
  # /sbin/modprobe ipmi_devintf
  # /sbin/modprobe ipmi_watchdog timeout=1800 start_now=1
- Execute the following command (please use panic.tar.gz):
  # insmod panic.ko
- Wait for a while (about 5 minutes or so) after diskdump finishes.
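As an optional check for the kernel.panic step above, it may help to confirm that the setting is in effect at run time before loading panic.ko. The sketch below assumes the value is exposed through procfs at /proc/sys/kernel/panic; the helper names parse_panic_timeout() and read_panic_timeout() are hypothetical and not part of any diskdump tooling.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the text read from the sysctl file.
   Returns 0 on success and stores the value, or -1 on malformed input. */
static int parse_panic_timeout(const char *text, long *seconds)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || errno != 0)
        return -1;

    /* Allow only trailing whitespace, such as the newline procfs appends. */
    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    *seconds = value;
    return 0;
}

/* Read and parse the sysctl from the given path.
   Returns 0 on success, or -1 if the file cannot be read or parsed. */
static int read_panic_timeout(const char *path, long *seconds)
{
    char buf[32];
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof buf, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_panic_timeout(buf, seconds);
}
```

Calling read_panic_timeout("/proc/sys/kernel/panic", &seconds) and finding seconds equal to 0 indicates that the halt-on-panic setting is active, so a reboot 255 seconds after the dump would point to the watchdog rather than to kernel.panic.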
Attachments