haldaemon fails to start on systems with a large number of disks in RHEL 5 and RHEL 6
Environment
- Red Hat Enterprise Linux 5
- Red Hat Enterprise Linux 6
Issue
- On server boot, or when running haldaemon via its initscript, hald fails to start:
    # /etc/init.d/haldaemon start
    Starting HAL daemon: FAILED
- When running in the foreground, starting hald is successful:
    # hald --use-syslog --verbose=yes --daemon=no
- The haldaemon service takes a long time at startup and eventually fails to start, but running hald --daemon=no manually works.
Resolution
- Upgrade to hal-0.5.8.1-62.el5 or later.
- Then create the file /etc/sysconfig/haldaemon and edit it by adding the following command line argument for hald:
    --child-timeout=600
- Adjust the timeout value to match the maximum time the child process takes to probe all the LUNs present on your system.
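One way to pick a timeout value is to start from a measured probe time and add some headroom. The sketch below is only an illustration: the helper name suggest_child_timeout and the 20% margin are assumptions made here, not part of hal or of this solution. The 250-second floor is the default wait described in this article.

```shell
# Hypothetical helper (not part of hal): suggest a --child-timeout value
# from a probe time you measured yourself.
suggest_child_timeout() {
    measured=$1              # seconds the device probe took on this system
    margin_pct=${2:-20}      # assumed safety margin in percent (our choice)
    default_timeout=250      # hald's default wait, as stated in this article

    # Add the margin using integer arithmetic.
    suggested=$(( measured + measured * margin_pct / 100 ))

    # Never suggest less than the default.
    if [ "$suggested" -lt "$default_timeout" ]; then
        suggested=$default_timeout
    fi
    echo "$suggested"
}

# Example: a probe measured at 500 seconds, with the assumed 20% margin.
suggest_child_timeout 500
```

With a measured probe time of 500 seconds, this arithmetic yields 600, the value used in the example argument above. A larger margin may be appropriate if more LUNs are expected to be added later.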
Root Cause
- The hald daemon is timing out waiting for the child process to probe all the devices. By default, hald waits for 250 seconds (4 minutes, 10 seconds) for its child process to complete device probing.
- The issue seems to occur most frequently on systems with a large number of disks.
Diagnostic Steps
- Determine how long it takes for hald to fail to start. You can do this by:
  - Running service haldaemon restart and then timing how long hald runs before failure, or
  - Running
      hald --use-syslog --verbose=yes
    and then examining the time stamps in the system log to determine when hald started and when it emitted its last message before exiting.
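The first timing approach can be sketched with a small wrapper that records the clock before and after a command. The function name elapsed_seconds is hypothetical, and the sketch assumes the initscript does not return until hald has either started or given up, which should be confirmed on the affected system.

```shell
# Hypothetical helper: run a command and print how many whole seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    # Send the command's own output to stderr so only the number is printed
    # on stdout; ignore the exit status, because a failing start is expected.
    "$@" 1>&2 || true
    end=$(date +%s)
    echo "$(( end - start ))"
}

# Usage on the affected host, as root:
#   elapsed_seconds service haldaemon restart
```

A result close to 250 seconds would be consistent with hald hitting its default child timeout.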
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.