Chapter 7. Troubleshooting

This chapter provides tips for determining the cause of and resolving the most common errors associated with RHN Satellite. If you need additional help, contact Red Hat Network support at https://rhn.redhat.com/help/contact.pxt. Log in using your Satellite-entitled account to see the full list of options.
To begin troubleshooting general problems, examine the log file or files related to the component exhibiting failures. A useful exercise is to issue the tail -f command for all log files and then run yum list. You should then examine all new log entries for potential clues.
A common issue is a full disk. An almost sure sign of this is log output that stops abruptly. If logging stopped during a write, such as mid-word, the disk has likely filled up. To confirm this, run this command and check the percentages in the Use% column:
df -h
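As a rough sketch of both checks, the following shell functions flag nearly full file systems and logs whose last line appears to have been cut short. The function names and the default threshold of 90 are illustrative assumptions, not part of RHN Satellite. The first function reads df -P output on standard input so that it can be tried against sample data.

```shell
# Print "mount_point use%" for every file system at or above a threshold.
# Reads `df -P` output on stdin; the threshold defaults to 90 (an assumption).
full_filesystems() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)               # strip the trailing percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Succeed if the file is empty or ends with a newline; a log whose final
# line lacks a newline may have been cut off mid-write.
log_ends_cleanly() {
    [ ! -s "$1" ] || [ -z "$(tail -c 1 "$1")" ]
}

# Usage on a live system (the log path is only an example):
#   df -P | full_filesystems 90
#   log_ends_cleanly /var/log/rhn/rhn_server_satellite.log || echo "possibly truncated"
```

A missing final newline is only a hint, so treat a positive result as a reason to look at df output rather than as proof of a full disk.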
In addition to log files, you can obtain valuable information by retrieving the status of your RHN Satellite and its various components. This can be done with the command:
/usr/sbin/rhn-satellite status
In addition, you can obtain the status of components such as the Apache Web server and the RHN Task Engine individually. For instance, to view the status of the Apache Web server, run the command:
service httpd status
7.1. Installing and Updating
Q: SELinux keeps giving me messages when I'm trying to install. Why?
Q: I changed /var/satellite to an NFS mount, and now SELinux is stopping it working properly. What do I need to do?
Q: My Satellite is failing. Any idea why?
7.2. Services
Q: Why isn't the Apache Web server running?
Q: How do I find out what the status of the RHN Task Engine is?
Q: How do I find out what the status of the Satellite's Embedded Database is?
Q: What do I do if yum, up2date, or the push capability of the RHN Satellite stops working?
7.3. Connectivity
Q: I can't connect! How do I work out what is wrong?
Q: What do I do if importing or synchronizing a channel fails and I can't recover it?
Q: I'm getting "SSL_CONNECT" errors. What do I do now?
7.4. Logging and Reporting
Q: What are the different log files?
Q: How do I use spacewalk-report?
Q: How do I work out what version of the database schema I have?
Q: How do I work out what character set types I have?
Q: Why isn't the administrator getting email?
Q: How do I change the sender of the traceback mail?
7.5. Errors
Q: I'm getting an "Error validating satellite certificate" error during RHN Satellite installation. How do I fix it?
Q: I'm getting an "ERROR: server.mount_point not set in the configuration file" error when I try to activate or synchronize the RHN Satellite. How do I fix it?
Q: Why does cobbler check give an error saying that it needs a different version of yum-utils?
Q: I'm getting an "unsupported version" error when I try to activate the RHN Satellite certificate. How do I fix it?
Q: I'm getting an "Internal Server Error" complaining about ASCII when I try to edit the kickstart profile. What's going on?
Q: I'm getting "Host Not Found" or "Could Not Determine FQDN" errors. What do I do now?
Q: I'm getting a "This server is not an entitled Satellite" error when I try to synchronize the RHN Satellite server. How do I fix it?

7.1. Installing and Updating

Q:
SELinux keeps giving me messages when I'm trying to install. Why?
A:
If you encounter any issues with SELinux messages (such as AVC denial messages) while installing RHN Satellite, have the audit log available so that Red Hat Support personnel can assist you. The file is located at /var/log/audit/audit.log, and you can attach it to your Support ticket.
Q:
I changed /var/satellite to an NFS mount, and now SELinux is stopping it working properly. What do I need to do?
A:
You will need to tell SELinux about the NFS mount in order for it to allow that traffic. You can do this with the command:
# /usr/sbin/setsebool -P spacewalk_nfs_mountpoint on
If you are using Red Hat Enterprise Linux 6, you will also need to run the command:
# /usr/sbin/setsebool -P cobbler_use_nfs on
Q:
My Satellite is failing. Any idea why?
A:
Do not subscribe your RHN Satellite to any of the following child channels available from RHN's central servers:
  • Red Hat Developer Suite
  • Red Hat Application Server
  • Red Hat Extras
Subscribing to these channels and updating your Satellite may install newer, incompatible versions of critical software components, causing the Satellite to fail.

7.2. Services

Q:
Why isn't the Apache Web server running?
A:
If the Apache Web server isn't running, entries in your /etc/hosts file may be incorrect.
Q:
How do I find out what the status of the RHN Task Engine is?
A:
To obtain the status of the RHN Task Engine, run the command:
service taskomatic status
Q:
How do I find out what the status of the Satellite's Embedded Database is?
A:
To view the status of the Satellite's Embedded Database, if it exists, run the command:
service oracle status
Q:
What do I do if yum, up2date, or the push capability of the RHN Satellite stops working?
A:
If yum, up2date, or the push capability of the RHN Satellite ceases to function, old log files may be at fault. Stop the jabberd daemon before removing these files. To do so, issue the following commands as root:
service jabberd stop
cd /var/lib/jabberd
rm -f _db*
service jabberd start

7.3. Connectivity

Q:
I can't connect! How do I work out what is wrong?
A:
The following measures can be used to troubleshoot general connection errors:
  • Attempt to connect to the RHN Satellite's database at the command line using the correct connection string as found in /etc/rhn/rhn.conf:
    sqlplus username/password@sid
  • Ensure the RHN Satellite is using Network Time Protocol (NTP) and set to the appropriate time zone. This also applies to all client systems and the separate database machine in RHN Satellite with Stand-Alone Database.
  • Confirm the correct package:
    rhn-org-httpd-ssl-key-pair-MACHINE_NAME-VER-REL.noarch.rpm
    is installed on the RHN Satellite and the corresponding rhn-org-trusted-ssl-cert-*.noarch.rpm or raw CA SSL public (client) certificate is installed on all client systems.
  • Verify the client systems are configured to use the appropriate certificate.
  • If also using one or more RHN Proxy Servers, ensure each Proxy's SSL certificates are prepared correctly. The Proxy should have both its own server SSL key-pair and CA SSL public (client) certificate installed, since it will serve in both capacities. Refer to the SSL Certificates chapter of the RHN Client Configuration Guide for specific instructions.
  • Make sure client systems are not using firewalls of their own, blocking required ports as identified in Section 2.4, “Additional Requirements”.
Q:
What do I do if importing or synchronizing a channel fails and I can't recover it?
A:
If importing/synchronizing a channel fails and you can't recover it in any other way, run this command to delete the cache:
rm -rf temporary-directory
Note that Section 6.2.2.1, “Preparing Channel Content ISOs” suggested that this temporary directory be /var/rhn-sat-import/.
Next, restart the importation or synchronization.
Q:
I'm getting "SSL_CONNECT" errors. What do I do now?
A:
A common connection problem, indicated by SSL_CONNECT errors, is the result of a Satellite being installed on a machine whose time had been improperly set. During the Satellite installation process, SSL certificates are created with inaccurate times. If the Satellite's time is then corrected, the certificate start date and time may be set in the future, making it invalid.
To troubleshoot this, check the date and time on the clients and the Satellite with the following command:
date
The results should be nearly identical for all machines and within the "notBefore" and "notAfter" validity windows of the certificates. Check the client certificate dates and times with the following command:
openssl x509 -dates -noout -in /usr/share/rhn/RHN-ORG-TRUSTED-SSL-CERT
Check the Satellite server certificate dates and times with the following command:
openssl x509 -dates -noout -in /etc/httpd/conf/ssl.crt/server.crt
By default, the server certificate has a one-year life while client certificates are good for 10 years. If you find the certificates are incorrect, you can either wait for the valid start time, if possible, or create new certificates, preferably with all system times set to GMT.
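The comparison between a machine's clock and a certificate's validity window can be sketched as a small shell function. This is an illustration under stated assumptions: the function name cert_time_status is hypothetical, it reads the output of the openssl x509 -dates -noout command shown above on standard input, and it relies on GNU date to parse the certificate timestamps.

```shell
# Report whether a given time falls inside a certificate's validity window.
# Reads `openssl x509 -dates -noout` output on stdin; $1 is the time to test
# as seconds since the epoch (defaults to now). Requires GNU date for -d.
cert_time_status() {
    now="${1:-$(date +%s)}"
    start=""
    end=""
    while IFS='=' read -r key value; do
        case "$key" in
            notBefore) start=$(date -u -d "$value" +%s) ;;
            notAfter)  end=$(date -u -d "$value" +%s) ;;
        esac
    done
    if [ -z "$start" ] || [ -z "$end" ]; then
        echo "unknown"
    elif [ "$now" -lt "$start" ]; then
        echo "not yet valid"
    elif [ "$now" -gt "$end" ]; then
        echo "expired"
    else
        echo "valid"
    fi
}

# Usage on the Satellite server:
#   openssl x509 -dates -noout -in /etc/httpd/conf/ssl.crt/server.crt | cert_time_status
```

A result of "not yet valid" matches the situation described above, where the certificate was created while the system clock was set incorrectly.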

7.4. Logging and Reporting

Q:
What are the different log files?
A:
Virtually every troubleshooting step should start with a look at the associated log file or files. These provide invaluable information about the activity that has taken place on the device or within the application that can be used to monitor performance and ensure proper configuration. See Table 7.1, “Log Files” for the paths to all relevant log files:
There may be numbered log files (such as /var/log/rhn/rhn_satellite_install.log.1, /var/log/rhn/rhn_satellite_install.log.2, etc.) within the /var/log/rhn/ directory. These are rotated logs: when the current rhn_satellite_install.log file reaches the size configured for logrotate(8), its contents are moved to a file with a .<NUMBER> extension. With the default logrotate numbering, rhn_satellite_install.log.1 is the most recently rotated log, and higher numbers hold progressively older logs.

Table 7.1. Log Files

Component/Task                              Log File Location
Apache Web server                           /var/log/httpd/ directory
RHN Satellite                               /var/log/rhn/ directory
RHN Satellite Installation Program          /var/log/rhn/rhn_satellite_install.log
Database installation - Embedded Database   /var/log/rhn/install_db.log
Database population                         /var/log/rhn/populate_db.log
RHN Satellite Synchronization Tool          /var/log/rhn/rhn_server_satellite.log
Monitoring infrastructure                   /var/log/nocpulse/ directory
Monitoring notifications                    /var/log/notification/ directory
RHN DB Control - Embedded Database          /var/log/rhn/rhn_database.log
RHN Task Engine (taskomatic)                /var/log/messages
yum                                         /var/log/yum.log
XML-RPC transactions                        /var/log/rhn/rhn_server_xmlrpc.log
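When a log file is long, a quick first pass is to pull out only the lines that mention errors or tracebacks. The following is a minimal sketch; the function name recent_errors and the default of 20 lines are assumptions made for illustration.

```shell
# Print the last N lines (default 20) of a log that mention "error" or
# "traceback", case-insensitively, each prefixed with its line number.
recent_errors() {
    grep -n -i -E 'error|traceback' "$1" | tail -n "${2:-20}"
}

# Usage with a path from Table 7.1:
#   recent_errors /var/log/rhn/rhn_server_satellite.log 10
```

The line numbers make it easier to open the file at the right place and read the surrounding context.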
Q:
How do I use spacewalk-report?
A:
There are instances where administrators may need a concise, formatted summary of their RHN Satellite resources, whether it is to take inventory of their entitlements, subscribed systems, or users and organizations. Rather than gathering such information manually from the Satellite Web interface, RHN Satellite 5.4 includes the spacewalk-report command to gather and display vital Satellite information at once.

Note

To use spacewalk-report you must have the spacewalk-reports package installed.
spacewalk-report allows administrators to organize and display reports about content, errata, systems, system event history, and user resources across the Satellite. The spacewalk-report command is used to generate reports on:
  • System Inventory — Lists all of the systems registered to the Satellite.
  • Entitlements — Lists all organizations on the Satellite, sorted by system or channel entitlements.
  • Errata — Lists all the errata relevant to the registered systems, sorts errata by severity as well as the systems that apply to a particular erratum.
  • Users — Lists all the users registered to the Satellite, and lists any systems associated with a particular user.
  • System History — Lists all, or a subset, of the system events that have occurred.
To get a report in CSV format, run the following at the command prompt of your Satellite server.
spacewalk-report report_name
The following reports are available:

Table 7.2. spacewalk-report Reports

Report                          Invoked as                      Description
System Inventory                inventory                       List of systems registered to the server, together with hardware and software information
Entitlements                    entitlements                    Lists all organizations on the Satellite with their system or channel entitlements
Errata in channels              errata-channels                 Lists errata in channels
All Errata                      errata-list-all                 Complete list of all errata
Errata for systems              errata-systems                  Lists applicable errata and any registered systems that are affected
Users in the system             users                           Lists all users registered to the Satellite
Systems administered            users-systems                   Lists systems that can be administered by individual users
Kickstart Trees                 kickstartable-trees             Lists trees able to be kickstarted
System history                  system-history                  Lists system event history
System history channels         system-history-channels         Lists system event history
System history configuration    system-history-configuration    Lists system configuration event history
System history entitlements     system-history-entitlements     Lists system entitlement event history
System history errata           system-history-errata           Lists system errata event history
System history kickstart        system-history-kickstart        Lists system kickstart and provisioning event history
System history packages         system-history-packages         Lists system package event history
For more information about an individual report, run spacewalk-report with the --info or --list-fields-info option and the report name. The description and list of possible fields in the report will be shown.
For further details about program invocations and their options, see the spacewalk-report(8) manpage or run spacewalk-report with the --help parameter.
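Because spacewalk-report writes its CSV output to standard output, several reports can be saved to files in one pass. The sketch below is only an illustration: the function name save_reports and the REPORT_CMD variable are hypothetical, and the variable exists solely so the loop can be tried on a machine without Satellite installed.

```shell
# Save each named report as <outdir>/<report>.csv.
# REPORT_CMD defaults to spacewalk-report; overriding it is only meant for
# trying the loop on a machine without Satellite installed.
save_reports() {
    outdir="$1"
    shift
    mkdir -p "$outdir" || return 1
    for report in "$@"; do
        "${REPORT_CMD:-spacewalk-report}" "$report" > "$outdir/$report.csv" || return 1
    done
}

# Usage (report names are taken from Table 7.2):
#   save_reports /tmp/satellite-reports inventory users errata-systems
```

The function stops at the first report that fails, so a partial output directory indicates which report to investigate.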
Q:
How do I work out what version of the database schema I have?
A:
To determine the version of your database schema, run the command:
rhn-schema-version
Q:
How do I work out what character set types I have?
A:
To derive the character set types of your Satellite's database, run the command:
rhn-charsets
Q:
Why isn't the administrator getting email?
A:
If the administrator is not getting email from the RHN Satellite, confirm the correct email addresses have been set for traceback_mail in /etc/rhn/rhn.conf.
Q:
How do I change the sender of the traceback mail?
A:
If the traceback mail is marked from dev-null@rhn.redhat.com and you would like the address to be valid for your organization, include the web.default_mail_from option and appropriate value in /etc/rhn/rhn.conf.
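To see which mail settings are currently in effect, the values can be read straight from the configuration file. The following sketch assumes the "name = value" line format used by the rhn.conf examples in this chapter; the function name rhn_conf_get is hypothetical.

```shell
# Print the value assigned to a key in a "name = value" style file.
# $1 is the file and $2 is the key; comment lines starting with # are skipped.
rhn_conf_get() {
    awk -v key="$2" '
        /^[ \t]*#/ { next }
        index($0, "=") {
            name = substr($0, 1, index($0, "=") - 1)
            value = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)     # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", value)    # trim the value
            if (name == key) print value
        }' "$1"
}

# Usage:
#   rhn_conf_get /etc/rhn/rhn.conf traceback_mail
#   rhn_conf_get /etc/rhn/rhn.conf web.default_mail_from
```

Empty output means the option is not set in the file.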

7.5. Errors

Q:
I'm getting an "Error validating satellite certificate" error during RHN Satellite installation. How do I fix it?
A:
An "Error validating satellite certificate" error during RHN Satellite installation is caused by having an HTTP proxy in the environment. This can be confirmed by looking at the install.log file, and locating the following error:
ERROR: unhandled exception occurred: 
Traceback (most recent call last): 
  File "/usr/bin/rhn-satellite-activate", line 45, in ? 
    sys.exit(abs(mod.main() or 0)) 
  File "/usr/share/rhn/satellite_tools/rhn_satellite_activate.py", line 585, in main 
    activateSatellite_remote(options) 
  File "/usr/share/rhn/satellite_tools/rhn_satellite_activate.py", line 291, in activateSatellite_remote 
    ret = s.satellite.deactivate_satellite(systemid, rhn_cert) 
  File "/usr/lib/python2.4/site-packages/rhn/rpclib.py", line 603, in __call__ 
    return self._send(self._name, args) 
  File "/usr/lib/python2.4/site-packages/rhn/rpclib.py", line 326, in _request 
    self._handler, request, verbose=self._verbose) 
  File "/usr/lib/python2.4/site-packages/rhn/transports.py", line 171, in request 
    headers, fd = req.send_http(host, handler) 
  File "/usr/lib/python2.4/site-packages/rhn/transports.py", line 698, in send_http 
    self._connection.connect() 
  File "/usr/lib/python2.4/site-packages/rhn/connections.py", line 193, in connect 
    sock.connect((self.host, self.port)) 
  File "<string>", line 1, in connect 
socket.timeout: timed out
To resolve the issue:
  1. Run the install script in disconnected mode, and skip the database installation which has already been done:
    ./install.pl --disconnected --skip-db-install
    
  2. Open /etc/rhn/rhn.conf with your preferred text editor, and add or modify the following line:
    server.satellite.rhn_parent = satellite.rhn.redhat.com
    
    Remove the following line:
    disconnected=1
    
    If you are using a proxy for the connection to Red Hat Network, you will also need to add or modify the following lines to reflect the proxy settings.
    server.satellite.http_proxy = <hostname>:<port>
    server.satellite.http_proxy_username = <username>
    server.satellite.http_proxy_password = <password>
    
  3. Re-activate the Satellite in connected mode, using the rhn-satellite-activate command as the root user, including the path and filename of the satellite certificate:
    # rhn-satellite-activate --rhn-cert=/path/to/file.cert
Alternatively, try running the install.pl script in connected mode, but with the --answer-file=<answer file> option. Ensure the answer file has the HTTP proxy information specified as follows:
rhn-http-proxy = <hostname>:<port>
rhn-http-proxy-username = <username>
rhn-http-proxy-password = <password>
Q:
I'm getting an "ERROR: server.mount_point not set in the configuration file" error when I try to activate or synchronize the RHN Satellite. How do I fix it?
A:
An "ERROR: server.mount_point not set in the configuration file" error during RHN Satellite activation or synchronization can occur if the mount_point configuration parameter in /etc/rhn/rhn.conf does not point to a directory path, if the directory it points to does not exist, or if the permissions on that directory do not allow access.
To resolve the issue, check the value of the mount_point configuration parameter in /etc/rhn/rhn.conf. If it is set to the default value of /var/satellite, verify that the /var/satellite and /var/satellite/redhat directories exist. For all values, check that the path is accurate and that the permissions are set correctly.
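The directory checks described above can be sketched as a shell function. The function name check_mount_point is hypothetical, and the test only covers whether the directories exist and can be read and entered by the user running it, so run it as the user whose access you want to verify.

```shell
# Verify that a mount point and its redhat subdirectory exist and are
# accessible to the current user. Prints one line per problem found and
# returns non-zero if any problem was found.
check_mount_point() {
    rc=0
    for dir in "$1" "$1/redhat"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            rc=1
        elif [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
            echo "not accessible: $dir"
            rc=1
        fi
    done
    return "$rc"
}

# Usage with the default value of mount_point:
#   check_mount_point /var/satellite && echo "mount point looks usable"
```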
Q:
Why does cobbler check give an error saying that it needs a different version of yum-utils?
A:
Sometimes, running the cobbler check command can give an error similar to the following:
cobbler check 
The following potential problems were detected: 
#0: yum-utils need to be at least version 1.1.17 for reposync -l, current version is 1.1.16
This is a known issue in Cobbler's reposync package. The error is spurious and can be safely ignored. This error will be resolved in future versions of RHN Satellite.
Q:
I'm getting an "unsupported version" error when I try to activate the RHN Satellite certificate. How do I fix it?
A:
If your RHN Satellite certificate has become corrupted, you could get one of the following errors:
ERROR: <Fault -2: 'unhandled internal exception: unsupported version: 96'>
RHN_PARENT: satellite.rhn.redhat.com
     Error reported from RHN: <Fault -2: 'unhandled internal exception: unsupported version: 115'>
     ERROR: unhandled XMLRPC fault upon remote activation: <Fault -2: 'unhandled internal exception: unsupported version: 115'>
     ERROR: <Fault -2: 'unhandled internal exception: unsupported version: 115'>
Invalid satellite certificate
To resolve this issue, contact Red Hat support services for a new certificate.
Q:
I'm getting an "Internal Server Error" complaining about ASCII when I try to edit the kickstart profile. What's going on?
A:
If you have recently added some kernel parameters to your kickstart profile, you might find that when you attempt to View a List of Kickstart Profiles, you get the following Internal Server Error:
'ascii' codec can't encode character u'\u2013'
This error occurs because the profile contains an en dash character (Unicode U+2013) in place of the two hyphens of an option, and that character cannot be encoded as ASCII.
To resolve the issue:
  1. Use ssh to log in directly to the Satellite server as the root user:
    ssh root@satellite.fqdn.com
    
  2. Find the kickstart profile that is causing the problem by looking at the dates of the files in /var/lib/cobbler/config/profiles.d and locating the one that was edited most recently:
    ls -l /var/lib/cobbler/config/profiles.d/
    
  3. Open the profile in your preferred text editor, and locate the following text:
    \u2013hostname
    
    Change the entry to read:
    --hostname
    
  4. Save changes to the profile and close the file.
  5. Restart the RHN Satellite services to pick up the updated profile:
    rhn-satellite restart
    Shutting down rhn-satellite...
    Stopping RHN Taskomatic...
    Stopped RHN Taskomatic.
    Stopping cobbler daemon:                                   [  OK  ]
    Stopping rhn-search...
    Stopped rhn-search.
    Stopping MonitoringScout ...                               [  OK  ]
    Stopping Monitoring ...                                    [  OK  ]
    Stopping httpd:                                            [  OK  ]
    Stopping tomcat5:                                          [  OK  ]
    Shutting down osa-dispatcher:                              [  OK  ]
    Shutting down Oracle Net Listener ...                      [  OK  ]
    Shutting down Oracle DB instance "rhnsat" ...              [  OK  ]
    Shutting down Jabber router:                               [  OK  ]
    Done.
    Starting rhn-satellite...
    Starting Jabber services                                   [  OK  ]
    Starting Oracle Net Listener ...                           [  OK  ]
    Starting Oracle DB instance "rhnsat" ...                   [  OK  ]
    Starting osa-dispatcher:                                   [  OK  ]
    Starting tomcat5:                                          [  OK  ]
    Starting httpd:                                            [  OK  ]
    Starting Monitoring ...                                    [  OK  ]
    Starting MonitoringScout ...                               [  OK  ]
    Starting rhn-search...
    Starting cobbler daemon:                                   [  OK  ]
    Starting RHN Taskomatic...
    Done.
    
  6. Return to the web interface. Note that the interface can take some time to resolve the services, but should return to normal after a minute or so.
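Steps 2 and 3 above can also be sketched on the command line. The function names below are hypothetical, and the sketch assumes the profile contains the literal six-character text \u2013 as shown in step 3. The second function writes the corrected profile to standard output rather than editing the file in place, so the result can be reviewed first.

```shell
# List files in a directory that contain the literal text \u2013.
profiles_with_en_dash() {
    grep -l -F '\u2013' "$1"/* 2>/dev/null
}

# Print a profile with every literal \u2013 replaced by two hyphens.
# Writes to stdout so the original file is left untouched.
fix_en_dash() {
    sed 's/\\u2013/--/g' "$1"
}

# Usage:
#   profiles_with_en_dash /var/lib/cobbler/config/profiles.d
#   fix_en_dash /var/lib/cobbler/config/profiles.d/PROFILE > /tmp/PROFILE.fixed
```

PROFILE stands for whichever file name the first command reports. Keep a backup of the original profile before copying the corrected version over it.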
Q:
I'm getting "Host Not Found" or "Could Not Determine FQDN" errors. What do I do now?
A:
Because RHN configuration files rely exclusively on fully qualified domain names (FQDNs), key applications must be able to resolve the name of the RHN Satellite into an IP address. Red Hat Update Agent, Red Hat Network Registration Client, and the Apache Web server are particularly prone to this problem: the RHN applications issue "host not found" errors, and the Web server reports "Could not determine the server's fully qualified domain name" when it fails to start.
This problem typically originates from the /etc/hosts file. You may confirm this by examining /etc/nsswitch.conf, which defines the methods and the order by which domain names are resolved. Usually, the /etc/hosts file is checked first, followed by Network Information Service (NIS) if used, followed by DNS. One of these has to succeed for the Apache Web server to start and the RHN client applications to work.
To resolve this problem, identify the contents of the /etc/hosts file. It may look like this:
127.0.0.1 this_machine.example.com this_machine localhost.localdomain localhost
First, in a text editor, remove the offending machine information, like so:
127.0.0.1 localhost.localdomain localhost
Then, save the file and attempt to re-run the RHN client applications or the Apache Web server. If they still fail, explicitly identify the IP address of the Satellite in the file, such as:
127.0.0.1 localhost.localdomain localhost
123.45.67.8 this_machine.example.com this_machine
Replace the value here with the actual IP address of the Satellite. This should resolve the problem. Keep in mind, if the specific IP address is stipulated, the file will need to be updated when the machine obtains a new address.
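To spot the offending entries quickly, the loopback line can be scanned for names other than the localhost ones. This is a minimal sketch; the function name loopback_aliases is hypothetical, and it only treats localhost and localhost.localdomain as expected names.

```shell
# Print every name mapped to 127.0.0.1 in a hosts file other than
# localhost and localhost.localdomain. $1 is the hosts file to inspect.
loopback_aliases() {
    awk '$1 == "127.0.0.1" {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^#/) break          # ignore a trailing comment
            if ($i != "localhost" && $i != "localhost.localdomain") print $i
        }
    }' "$1"
}

# Usage:
#   loopback_aliases /etc/hosts
```

Any name printed is one that resolves to the loopback address and is a candidate for removal or for its own line with the real IP address of the Satellite.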
Q:
I'm getting a "This server is not an entitled Satellite" error when I try to synchronize the RHN Satellite server. How do I fix it?
A:
If satellite-sync reports that the server is not activated as an RHN Satellite, it isn't subscribed to the respective RHN Satellite channel. If this is a newly installed system, the satellite certificate has possibly not been activated on the system. If it was activated earlier, it has become deactivated.
Check the system's child channels to discover if it is subscribed to any Red Hat Network RHN Satellite channel, using one of these methods:
  • Log in to Red Hat Network and look up the system's child channels.
  • On a Red Hat Enterprise Linux 5 or 6 system, view the channels to which the system is subscribed with this command:
    yum repolist
Activate the same Satellite certificate again on your Satellite, using this command as the root user:
rhn-satellite-activate -vvv --rhn-cert=/path/to/certificate

Note

If you've exhausted these troubleshooting steps or want to defer them to Red Hat Network professionals, Red Hat recommends you take advantage of the strong support that comes with RHN Satellite. The most efficient way to do this is to aggregate your Satellite's configuration parameters, log files, and database information and send this package directly to Red Hat.
RHN provides a command line tool explicitly for this purpose: the Satellite Diagnostic Info Gatherer, commonly known by its command, satellite-debug. To use this tool, issue the command as root. You will see the pieces of information collected and the single tarball created, like so:
[root@miab root]# satellite-debug
Collecting and packaging relevant diagnostic information.
Warning: this may take some time...
    * copying configuration information
    * copying logs
    * querying RPM database (versioning of RHN Satellite, etc.)
    * querying schema version and database character sets
    * get diskspace available
    * timestamping
    * creating tarball (may take some time): /tmp/satellite-debug.tar.bz2
    * removing temporary debug tree
 
Debug dump created, stored in /tmp/satellite-debug.tar.bz2
Once finished, email the generated tarball from the /tmp/ directory to your Red Hat representative or support contact for immediate diagnosis.