haldaemon fails to start on systems with a large number of disks in RHEL 5 and RHEL 6
Environment
- Red Hat Enterprise Linux 5
- Red Hat Enterprise Linux 6
Issue
- On server boot, or when haldaemon is started via its initscript, hald fails to start:
  # /etc/init.d/haldaemon start
  Starting HAL daemon: FAILED
- When run in the foreground, hald starts successfully:
  # hald --use-syslog --verbose=yes --daemon=no
- The haldaemon service takes a long time at startup and eventually fails to start, but running hald --daemon=no manually works.
Resolution
- Upgrade to hal-0.5.8.1-62.el5 or later.
- Then create the file /etc/sysconfig/haldaemon and add the following command-line argument for hald:
  --child-timeout=600
- Adjust the timeout value to match the maximum time the child process takes to probe all the LUNs on your system.
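One possible way to pick a timeout value is to take the measured probe time and add some headroom. The sketch below is only an illustration: suggest_timeout is a hypothetical helper, and the 50 percent default margin is an arbitrary assumption, not a Red Hat recommendation.

```shell
#!/bin/sh
# Hypothetical helper: suggest a --child-timeout value from a measured
# probe time (in seconds) plus a safety margin (in percent).
# The 50% default margin is an arbitrary assumption for illustration.
suggest_timeout() {
    measured=$1
    margin_pct=${2:-50}
    echo $(( measured + measured * margin_pct / 100 ))
}

# Example: probing was observed to take about 400 seconds.
suggest_timeout 400   # prints 600
```

The result would then be used as the value of --child-timeout in place of the 600 shown above.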
Root Cause
- The hald daemon is timing out while waiting for its child process to probe all the devices. By default, hald waits 250 seconds (4 minutes, 10 seconds) for its child process to complete device probing.
- The issue seems to occur most frequently on systems with a large number of disks.
Diagnostic Steps
- Determine how long it takes for hald to fail to start. You can do this by either:
  - Running service haldaemon restart and timing how long hald runs before failure, or
  - Running hald --use-syslog --verbose=yes and then examining the time stamps in the system log to determine when hald started and when it emitted its last message before exiting.
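Comparing the syslog time stamps comes down to subtracting one from the other. The following is a minimal sketch, assuming GNU date (for the -d option) and two time stamps copied by hand from the system log. elapsed_seconds is a hypothetical helper name, and the time stamps shown are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the number of seconds between two
# syslog-style time stamps (e.g. "Jan  5 10:00:00").
# Assumes GNU date; both stamps are taken to be in the same year.
# TZ=UTC is only used so that daylight-saving shifts cannot skew the result.
elapsed_seconds() {
    start=$(TZ=UTC date -d "$1" +%s) || return 1
    end=$(TZ=UTC date -d "$2" +%s) || return 1
    echo $(( end - start ))
}

# Made-up example: first and last hald messages in the log.
elapsed_seconds "Jan  5 10:00:00" "Jan  5 10:04:10"   # prints 250
```

A result close to 250 seconds would be consistent with hald hitting its default child timeout. For the first approach, prefixing the restart command with the shell's time keyword (time service haldaemon restart) reports the elapsed time directly.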
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.