How can I avoid the error "Creating images for imaged and diskless nodes Failed" during HPC installation?
Environment
- Red Hat Enterprise Linux 5 Update 6
- HPC Solution 5.5
Issue
- Customer is trying to install HPC Solution and during pcm-setup execution the following error appears:
Refreshing repo: rhel5_x86_64. This may take a while...
Setting up httpd: [ OK ]
Setting up dhcpd: [ OK ]
Generating hosts, hosts.equiv, and resolv.conf: [ OK ]
Setting up motd: [ OK ]
Setting up named: [ OK ]
Setting up shared home nfs export: [ OK ]
Setting up ntpd: [ OK ]
Setting up SSH public keys: [ OK ]
Setting up SSH host file: [ OK ]
Setting up user skel files: [ OK ]
Setting up xinetd: [ OK ]
Setting yum repos: [ OK ]
Creating images for imaged and diskless nodes
This will take some time. Please wait: [FAILED]
Setting up rsyslog on PCM master: [ OK ]
Setting up Ntop: [ OK ]
Setting up CFM: [ OK ]
Setting up default Firefox homepage: [ OK ]
Setting up fstab for home directories: [ OK ]
Synchronizing System configuration files: [ OK ]
Setting appglobals variables: [ OK ]
- After that, the image needed to install the compute nodes is not available, and it is impossible to install the nodes.
Resolution
1. Check in /depot/repos/1000/ for the file ks.cfg.192.168.1.3 and verify its permissions. Here is an example from a head node:
# ls -l /depot/repos/1000/
total 316
drwxr-xr-x 3 root root 4096 Apr 14 10:55 Cluster
drwxr-xr-x 3 root root 4096 Apr 14 10:55 ClusterStorage
drwxr-xr-x 2 root root 4096 Apr 14 10:56 images
drwxr-xr-x 2 root root 4096 Apr 14 10:55 isolinux
-rw-r--r-- 1 root root 135 Apr 14 10:56 ks.cfg.192.168.0.5
drwxr-xr-x 3 root root 294912 Apr 14 10:56 Server
drwxr-xr-x 3 root root 4096 Apr 14 10:55 VT
2. httpd on the head node should be serving this kickstart (ks) file. Check /var/www/html, where repos should be a symbolic link to /depot/repos:
# ls -l /var/www/html/
total 20
lrwxrwxrwx 1 root root 13 Mar 2 17:10 cfm -> /opt/kusu/cfm
lrwxrwxrwx 1 root root 13 Mar 2 17:10 images -> /depot/images
-rw-r--r-- 1 apache apache 119 Jun 11 2010 index.html
lrwxrwxrwx 1 root root 15 Mar 2 17:10 kits -> /depot/www/kits
drwxr-xr-x 4 root root 4096 Mar 2 15:15 portal
-rw-r--r-- 1 root root 419 Mar 2 17:10 public_keys
lrwxrwxrwx 1 root root 12 Mar 2 17:10 repos -> /depot/repos
3. When the client requests the kickstart file, the request should be logged in the httpd logs on the head node:
# grep -i cfg /var/log/httpd/*
/var/log/httpd/access_log:192.168.0.6 - - [02/May/2011:15:29:56 -0400] "GET /repos/1000/ks.cfg.192.168.0.5 HTTP/1.0" 200 135 "-" "anaconda/11.1.2.224"
If the permissions or directory structure on the head node are badly broken or missing, it is possible that pcm-setup was run multiple times on this head node, or was run with errors.
4. Make sure you did not change any boot parameters in grub.conf on the head node, such as "quiet" on the kernel line.
5. Make sure that kdump is disabled during the head node installation process.
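The checks in steps 1 to 5 can be scripted. The following is a minimal sketch, not part of pcm-setup: all function names are hypothetical, and the default paths (/depot/repos/1000, /var/log/httpd/access_log, /boot/grub/grub.conf, /proc/cmdline) are assumptions based on the examples above and a default RHEL 5 layout. The grub check assumes the installer-generated kernel line contains "quiet", and the kdump check assumes that a crashkernel= boot parameter indicates kdump is configured.

```shell
#!/bin/sh
# Hypothetical helper functions for checking a head node; adjust paths as needed.

# Step 1: report whether each ks.cfg.* file in the repo directory is world-readable.
check_ks_files() {
    repo_dir=${1:-/depot/repos/1000}
    found=0
    status=0
    for f in "$repo_dir"/ks.cfg.*; do
        [ -e "$f" ] || continue
        found=1
        # find -perm -004 matches files with the "other" read bit set
        if [ -n "$(find "$f" -maxdepth 0 -perm -004)" ]; then
            echo "OK: $f"
        else
            echo "NOT WORLD-READABLE: $f"
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "MISSING: no ks.cfg.* in $repo_dir"
        return 1
    fi
    return "$status"
}

# Step 2: print the HTTP status code returned for a kickstart URL (expect 200).
fetch_ks_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Step 3: count kickstart requests recorded in an httpd access log.
count_ks_requests() {
    access_log=${1:-/var/log/httpd/access_log}
    grep -c 'GET /repos/[^ ]*/ks\.cfg\.' "$access_log"
}

# Step 4: flag kernel lines in grub.conf that no longer contain "quiet".
check_grub_quiet() {
    grub_conf=${1:-/boot/grub/grub.conf}
    missing=$(grep -E '^[[:space:]]*kernel[[:space:]]' "$grub_conf" | grep -vw 'quiet' || true)
    if [ -n "$missing" ]; then
        echo "kernel line without quiet:"
        echo "$missing"
        return 1
    fi
    echo "OK: all kernel lines contain quiet"
}

# Step 5: flag a crashkernel= boot parameter, which reserves memory for kdump.
check_kdump_cmdline() {
    cmdline_file=${1:-/proc/cmdline}
    if grep -q 'crashkernel=' "$cmdline_file"; then
        echo "crashkernel= is set; kdump memory is reserved"
        return 1
    fi
    echo "OK: no crashkernel= parameter"
}
```

After sourcing the file on the head node, running check_ks_files, count_ks_requests, check_grub_quiet and check_kdump_cmdline without arguments uses the default paths. fetch_ks_status takes the full URL of the kickstart file, for example http://localhost/repos/1000/ks.cfg.192.168.0.5 for the file shown in step 1.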