haldaemon fails to start on systems with a large number of disks in RHEL 5 and RHEL 6
- Red Hat Enterprise Linux 5
- Red Hat Enterprise Linux 6
On server boot, or when starting haldaemon via its initscript, hald fails to start:
# /etc/init.d/haldaemon start
Starting HAL daemon: FAILED
When run in the foreground, hald starts successfully:
# hald --use-syslog --verbose=yes --daemon=no
- The haldaemon service takes a long time at startup and eventually fails to start, but running hald --daemon=no manually works.
- Upgrade to
Then create the file /etc/sysconfig/haldaemon and edit it by adding the following command line argument for hald:
- Adjust the timeout value to match the maximum time the child process takes to probe all the LUNs on your system.
The hald daemon is timing out waiting for the child process to probe all the devices. By default, hald waits for 250 seconds (4 minutes, 10 seconds) for its child process to complete device probing.
- The issue seems to occur most frequently on systems with a large number of disks.
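As a rough illustration of sizing a timeout against the 250-second default described above, the shell sketch below pads a measured probe time with some headroom and prints a duration in minutes and seconds. The function names suggest_timeout and format_duration, and the 50% headroom, are assumptions made for this example; they are not part of hal.

```shell
# Default hald wait for its child process, in seconds (from this article).
DEFAULT_TIMEOUT=250

# Print a suggested timeout for a measured probe time given in seconds.
# Assumption: 50% headroom is an arbitrary rule of thumb for this sketch.
suggest_timeout() {
    measured=$1
    padded=$(( measured + measured / 2 ))
    # Never suggest less than the default.
    if [ "$padded" -lt "$DEFAULT_TIMEOUT" ]; then
        padded=$DEFAULT_TIMEOUT
    fi
    echo "$padded"
}

# Print a number of seconds as "M min S s".
format_duration() {
    echo "$(( $1 / 60 )) min $(( $1 % 60 )) s"
}
```

For example, suggest_timeout 400 prints 600, and format_duration 250 prints 4 min 10 s.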
- Determine how long it takes for hald to fail to start. You can do this by running service haldaemon restart and timing how long hald runs before failure, or by running hald --use-syslog --verbose=yes and then examining the time stamps in the system log to determine when hald started and when it emitted its last message before exiting.
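Both measurements can be sketched in shell. The helper names elapsed_seconds and hald_first_last are made up for this example, and the log file path is passed in as an argument because the syslog destination is an assumption (/var/log/messages is the usual file on RHEL 5 and 6, but it may differ on your system).

```shell
# Run a command, discard its output, and print how many whole seconds it took.
# The command's exit status is reported on stderr so a failure does not
# abort the measurement.
elapsed_seconds() {
    start=$(date +%s)
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    end=$(date +%s)
    echo "exit status: $status" >&2
    echo "$(( end - start ))"
}

# Print the first and last lines in a log file that mention hald, so their
# time stamps can be compared by eye.
hald_first_last() {
    grep 'hald' "$1" | sed -n '1p;$p'
}

# On an affected system, as root (not run here):
#   elapsed_seconds service haldaemon restart
#   hald_first_last /var/log/messages
```

The first helper only has one-second resolution, which is enough when the failure takes minutes to appear.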
I just saw this issue on RHEL 5.9 with hal 0.5.8.1-64.el5 installed already.
It does in fact have a high number of disks/luns.
Creating the /etc/sysconfig file with contents suggested here resolved the issue.
Great to hear. Thanks Jeffrey.
If you have to add a timeout because you have too many devices, then is there even a point to keep it enabled? I have my timeout set to 900 before it quits. I'm thinking that I should just disable haldaemon. Without the timeout, it takes one hour to start up.
The point in this article is to tell you how to INCREASE the timeout from the default to PREVENT the timeout.
On my system, which has 410 multipath devices with two /dev/sd* paths each, not to mention many logical volumes in LVM, tape drives, etc., setting the value to 600 (10 minutes) gave it plenty of time to start everything.
Having said that, when I looked into whether haldaemon really needs to be running, I found various links suggesting it does NOT on servers, as its main benefit is for X.
One such link that talks about other default services:
"man hald" doesn't suggest it is restricted to X so I'm not planning on turning it off on my systems.
Since it CAN time out and your system will boot anyway, disabling it doesn't seem like it would be a problem, so it sounds like it is entirely up to you.
Thanks for the link. Yeah, I'm pretty sure nothing we run is using haldaemon. And since we have almost 2,000 LUNs on this machine, haldaemon takes one hour to start.