Chapter 13. Troubleshooting Installation

This chapter covers some of the more common issues encountered when installing or migrating Certificate System.

IMPORTANT

For instance creation errors (related to running pkicreate), check the /var/lib/instance_name/logs/instance-install.log file. For configuration errors (that is, errors when setting up the subsystem after running pkicreate), check the catalina.out and debug files in the instance's log directory.
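As a starting point for reading those logs, the following is a minimal sketch of a helper that prints lines that look like errors. The function name show_log_errors and the search pattern are assumptions chosen to match the log excerpts in this chapter; they are not part of Certificate System.

```shell
# Print lines that look like errors from each readable log file given as an argument.
# The pattern is a heuristic based on the excerpts in this chapter, not an official list.
show_log_errors() {
    for log in "$@"; do
        # Skip files that do not exist or cannot be read.
        [ -r "$log" ] || continue
        # -H prefixes the file name, -n the line number; no match is not a failure here.
        grep -HnE 'SEVERE|Exception|Error' "$log" || true
    done
    return 0
}
```

For example, pass it the catalina.out and debug files from the instance's log directory: show_log_errors /var/lib/instance_name/logs/catalina.out /var/lib/instance_name/logs/debug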
Q: I can't see any Certificate System packages or updates.
Q: The init script returned an OK status, but my CA instance does not respond. Why?
Q: I can't open the pkiconsole and I'm seeing Java exceptions in stdout.
Q: I tried to run pkiconsole, and I got Socket exceptions in stdout. Why?
Q: I am having trouble running through the configuration wizard for a new instance. It's giving me errors about already having the certificate for the subsystem installed.
Q: I want to set different certificate validity periods and extensions for my root certificate authority — but I don't see a way to set it in the configuration wizard.
Q: I'm seeing an HTTP 500 error code when I try to connect to the web services pages after configuring my subsystem instance.
Q: I keep getting errors when I try to configure the LDAP internal database for my instance. It says the database has already been used. Why?
Q: I'm seeing an authentication error when I try to run the migration tools to upgrade my instance.
Q: I ran the Old_versionToTxt script and then TxtTo73, and I'm seeing AUTH_TOKEN errors in my debug log.
Q:
I can't see any Certificate System packages or updates.
A:
Certificate System packages are delivered through Red Hat Network. Configure your yum repositories to point to Red Hat Network or a local Satellite, and use an account with the appropriate subscriptions to access the Red Hat Certificate System channels.
Q:
The init script returned an OK status, but my CA instance does not respond. Why?
A:
This should not happen. Usually (but not always), this indicates a listener problem with the CA, but it can have many different causes. Check the catalina.out, system, and debug log files for the instance to see what errors have occurred. One common error is described here.
One situation is when there is a PID for the CA, indicating that the process is running, but no listeners have been opened for the server. This shows up as a java.lang.reflect.InvocationTargetException in the catalina.out file:
Oct 29, 2010 4:15:44 PM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-9080
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:615)
        at org.apache.catalina.startup.Bootstrap.load(Bootstrap.java:243)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:408)
Caused by: java.lang.UnsatisfiedLinkError: jss4
This could mean that you have the wrong version of JSS or NSS. The process requires libnss3.so in the library path. Check for it with this command:
ldd /usr/lib64/libjss4.so
If libnss3.so is not found, try unsetting the LD_LIBRARY_PATH variable and restarting the CA.
unset LD_LIBRARY_PATH
service pki-ca restart
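The ldd check can be narrowed to only the libraries that failed to resolve. This is a minimal sketch; the function name missing_libs is hypothetical, and it assumes ldd marks unresolved libraries with "=> not found", which is its usual output on Linux.

```shell
# Read ldd output on stdin and print the name of every library reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Run it as: ldd /usr/lib64/libjss4.so | missing_libs
If libnss3.so appears in the output, the library path is the likely cause.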
Q:
I can't open the pkiconsole and I'm seeing Java exceptions in stdout.
A:
This probably means that you have the wrong JRE installed or the wrong JRE set as the default. Run alternatives --config java to see what JRE is selected. Red Hat Certificate System requires OpenJDK 1.6.
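To confirm which JRE is actually being picked up, the output of java -version can be checked directly. The sketch below assumes the usual shape of that output (a first line such as: java version "1.6.0_20", followed by a line naming the runtime); the function name is_openjdk_16 is hypothetical.

```shell
# Read `java -version` output on stdin; succeed only if it reports OpenJDK with version 1.6.
is_openjdk_16() {
    info=$(cat)
    # Take the major.minor part of the first 'version "x.y..."' line.
    version=$(printf '%s\n' "$info" |
        sed -n 's/^[a-z]* version "\([0-9]*\.[0-9]*\).*/\1/p' | head -n 1)
    [ "$version" = "1.6" ] && printf '%s\n' "$info" | grep -q 'OpenJDK'
}
```

Because java -version writes to standard error, redirect it when piping: java -version 2>&1 | is_openjdk_16 && echo "JRE looks correct"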
Q:
I tried to run pkiconsole, and I got Socket exceptions in stdout. Why?
A:
This means that there is a port problem. Either the SSL settings for the administrative port are incorrect (meaning that the configuration in server.xml is bad) or the wrong port was given to access the admin interface.
Port errors look like the following:
NSS Cipher Supported '0xff04'
java.io.IOException: SocketException cannot read on socket
        at org.mozilla.jss.ssl.SSLSocket.read(SSLSocket.java:1006)
        at org.mozilla.jss.ssl.SSLInputStream.read(SSLInputStream.java:70)
        at com.netscape.admin.certsrv.misc.HttpInputStream.fill(HttpInputStream.java:303)
        at com.netscape.admin.certsrv.misc.HttpInputStream.readLine(HttpInputStream.java:224)
        at com.netscape.admin.certsrv.connection.JSSConnection.readHeader(JSSConnection.java:439)
        at com.netscape.admin.certsrv.connection.JSSConnection.initReadResponse(JSSConnection.java:430)
        at com.netscape.admin.certsrv.connection.JSSConnection.sendRequest(JSSConnection.java:344)
        at com.netscape.admin.certsrv.connection.AdminConnection.processRequest(AdminConnection.java:714)
        at com.netscape.admin.certsrv.connection.AdminConnection.sendRequest(AdminConnection.java:623)
        at com.netscape.admin.certsrv.connection.AdminConnection.sendRequest(AdminConnection.java:590)
        at com.netscape.admin.certsrv.connection.AdminConnection.authType(AdminConnection.java:323)
        at com.netscape.admin.certsrv.CMSServerInfo.getAuthType(CMSServerInfo.java:113)
        at com.netscape.admin.certsrv.CMSAdmin.run(CMSAdmin.java:499)
        at com.netscape.admin.certsrv.CMSAdmin.run(CMSAdmin.java:548)
        at com.netscape.admin.certsrv.Console.main(Console.java:1655)
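To rule out a wrong port, it can help to list the ports that the instance's server.xml actually defines and compare them with the one passed to pkiconsole. This is a rough sketch, not an XML parser: the function name connector_ports is hypothetical, it assumes each port attribute is on the same line as its opening Connector tag, and it does not skip connectors that are commented out.

```shell
# Print the port of every <Connector ...> element found in the given server.xml.
# Assumes port="..." appears on the same line as the opening <Connector tag.
connector_ports() {
    grep -o '<Connector[^>]*port="[0-9]*"' "$1" |
        sed 's/.*port="\([0-9]*\)"/\1/'
}
```

Pass it the path to the instance's server.xml. The match is case-sensitive, so attributes such as redirectPort are not reported.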
Q:
I am having trouble running through the configuration wizard for a new instance. It's giving me errors about already having the certificate for the subsystem installed.
A:
This error occurs when the same browser profile is used to configure multiple instances of the same type of subsystem. If a subsystem (or the entire Certificate System) is removed and reinstalled, then the previous CA certificate and subsystem certificates need to be deleted from the browser profile. Otherwise, when the replacement subsystem is created, the CA certificate has the same subject name and serial number as the previous instance, which creates a conflict in the browser NSS databases.
The easiest solution is to create new browser profiles when reinstalling instances of Certificate System.
Q:
I want to set different certificate validity periods and extensions for my root certificate authority — but I don't see a way to set it in the configuration wizard.
A:
There isn't currently a way to do this in the configuration wizard. However, there is a way to edit the certificate profiles used by the configuration wizard to generate the root CA certificates.

IMPORTANT

This must be done before running pkicreate to create a new CA instance.
  1. Back up the original CA certificate profile used by the configuration wizard.
    cp -p /usr/share/pki/ca/conf/caCert.profile /usr/share/pki/ca/conf/caCert.profile.orig
  2. Open the CA certificate profile used by the configuration wizard.
    vim /usr/share/pki/ca/conf/caCert.profile
  3. Reset the validity period in the Validity Default to whatever you want. The range is given in days. For example, to change the period to 7200 days (a little under 20 years):
    2.default.class=com.netscape.cms.profile.def.ValidityDefault
    2.default.name=Validity Default
    2.default.params.range=7200
  4. Add any extensions by creating a new default entry in the profile and adding it to the list. For example, to add the Basic Constraint Extension, add the default (which, in this example, is default #9):
    9.default.class=com.netscape.cms.profile.def.BasicConstraintsExtDefault
    9.default.name=Basic Constraint Extension Constraint
    9.default.params.basicConstraintsCritical=true
    9.default.params.basicConstraintsIsCA=true
    9.default.params.basicConstraintsPathLen=2
    Then, add the default number to the list of defaults to use the new default:
    list=2,4,5,6,7,8,9
  5. Once the new profile is set up, run pkicreate to create the new CA instance, and then go through the configuration wizard.
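The backup and validity edit in the steps above can also be scripted. The following is a minimal sketch under two assumptions: that the Validity Default is entry 2 in the profile, as in the example, and that the range value is a number of days. The function name set_validity_range is hypothetical.

```shell
# Set 2.default.params.range in a profile file, keeping a .orig backup next to it.
# Usage: set_validity_range PROFILE DAYS
set_validity_range() {
    profile=$1
    days=$2
    # Refuse anything that is not a plain positive integer.
    case $days in
        ''|*[!0-9]*) return 1 ;;
    esac
    cp -p "$profile" "$profile.orig" || return 1
    # Rewrite only the range line; every other line is copied through unchanged.
    sed "s/^\([[:space:]]*2\.default\.params\.range=\).*/\1$days/" "$profile.orig" > "$profile"
}
```

For example: set_validity_range /usr/share/pki/ca/conf/caCert.profile 730
As with the manual procedure, this has to be done before running pkicreate.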
Q:
I'm seeing an HTTP 500 error code when I try to connect to the web services pages after configuring my subsystem instance.
A:
This is an unexpected generic error that can have many different causes. Check the catalina.out, system, and debug log files for the instance to see what errors have occurred. Two common errors are listed here, but there are many other possibilities.
Error #1: The LDAP database is not running.
If the Red Hat Directory Server instance used for the internal database is not running, then you cannot connect to the instance. This is apparent from exceptions in the catalina.out file stating that the instance is not ready:
java.io.IOException: CS server is not ready to serve.
        com.netscape.cms.servlet.base.CMSServlet.service(CMSServlet.java:409)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:688)
The Tomcat logs will specifically identify the problem with the LDAP connection:
5558.main - [29/Oct/2010:11:13:40 PDT] [8] [3] In Ldap (bound) connection pool to host ca1 port 389, Cannot connect to LDAP server. Error: netscape.ldap.LDAPException: failed to connect to server ldap://ca1.example.com:389 (91)
As will the instance's debug log:
[29/Oct/2010:11:39:10][main]: CMS:Caught EBaseException
Internal Database Error encountered: Could not connect to LDAP server host ca1 port 389 Error netscape.ldap.LDAPException: failed to connect to server ldap://ca1:389 (91)
        at com.netscape.cmscore.dbs.DBSubsystem.init(DBSubsystem.java:262)
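Both log excerpts name the Directory Server host and port that could not be reached. A small filter can pull those out of a log so that the right server is checked. This sketch assumes the "host NAME port NUMBER" wording shown in the excerpts above, with each log entry on a single line; the function name ldap_targets is hypothetical.

```shell
# Read log text on stdin and print each distinct LDAP host:port mentioned
# in "host NAME port NUMBER" form.
ldap_targets() {
    sed -n 's/.*host \([^ ]*\) port \([0-9][0-9]*\).*/\1:\2/p' | sort -u
}
```

For example: ldap_targets < /var/lib/instance_name/logs/debug
Then verify that the Directory Server instance on that host and port is running.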
Error #2: A VPN is blocking access.
Another possibility is that you are connecting to the subsystem over a VPN. The VPN must have a configuration option such as "Use this connection only for resources on its network" enabled. If that option is not enabled, then the catalina.out log file for the instance's Tomcat service shows a series of connection errors that result in the HTTP 500 error:
May 26, 2010 7:09:48 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet services threw exception
java.io.IOException: CS server is not ready to serve.
        at com.netscape.cms.servlet.base.CMSServlet.service(CMSServlet.java:441)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:542)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
        at java.lang.Thread.run(Thread.java:636)
Q:
I keep getting errors when I try to configure the LDAP internal database for my instance. It says the database has already been used. Why?
A:
The configuration process creates a new suffix and database in the Directory Server. Even after an instance is removed, its database remains in the Directory Server so that its information is preserved. If a new instance is created with the same name, then either give the Directory Server suffix a new, unique name in the configuration wizard or select the checkbox to overwrite any existing data in the database.
If the Directory Server database for the subsystem instance isn't given a unique name, then configuration does not move past that point, and the debug file contains these errors about the database already being used:
[15/Jul/2010:16:28:15][http-7445-Processor25]: DatabasePanel populateDB: creating non-secure (non-SSL) connection for internal ldap
[15/Jul/2010:16:28:15][http-7445-Processor25]: DatabasePanel connecting to 10.14.5.25:389
[15/Jul/2010:16:28:15][http-7445-Processor25]: DatabasePanel update: This database has already been used.
[15/Jul/2010:16:28:15][http-7445-Processor25]: DatabasePanel update: populateDB Exception: java.io.IOException: This database has already been used. Select the checkbox below to remove all data and reuse this database
[15/Jul/2010:16:28:15][http-7445-Processor25]: panel no=7
[15/Jul/2010:16:28:15][http-7445-Processor25]: panel name=database
[15/Jul/2010:16:28:15][http-7445-Processor25]: total number of panels=19
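To confirm that this is the error being hit, and which Directory Server it refers to, the debug log can be checked for the messages shown above. This is a minimal sketch based only on the wording of that excerpt; the function name db_reuse_report is hypothetical.

```shell
# If the given debug log contains the "already been used" error, print the
# Directory Server address(es) the DatabasePanel connected to and succeed.
# Return non-zero if the error is not present.
db_reuse_report() {
    log=$1
    grep -q 'This database has already been used' "$log" || return 1
    sed -n 's/.*DatabasePanel connecting to \([^ ]*\).*/\1/p' "$log" | sort -u
}
```

For example: db_reuse_report /var/lib/instance_name/logs/debug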
Q:
I'm seeing an authentication error when I try to run the migration tools to upgrade my instance.
A:
The migration tools are out of date. The new migration tools are included with the new Certificate System packages. To upgrade to 8.1, use the upgrade tools supplied with 8.1.
If you upgrade by running TxtTo73 instead of TxtTo81, the new instance cannot import the previous information, which means that it cannot recognize and confirm the old certificates. This results in an authentication error for the wrong admin certificate in the debug logs:
16076.http-9443-Processor20 - [29/Jun/2009:16:03:58 PDT] [6] [3] Agent authentication cannot evaluate the revocation status.
16076.http-9443-Processor24 - [29/Jun/2009:16:04:00 PDT] [6] [3] Agent authentication cannot evaluate the revocation status.
16076.http-9443-Processor18 - [29/Jun/2009:16:04:02 PDT] [6] [3] Agent authentication cannot evaluate the revocation status.
16076.http-9443-Processor18 - [29/Jun/2009:16:04:02 PDT] [13] [6] checkPermission(): permission denied for the resource certServer.ca.request.profile on operation read
Trying to enroll the new instance certificates as part of migration also fails with similar authentication errors when the certificate extensions aren't found:
[29/Jun/2009:16:35:42][http-9443-Processor23]: KeyUsageExtDefault: populate start
[29/Jun/2009:16:35:42][http-9443-Processor23]: KeyUsageExtDefault: populate end
[29/Jun/2009:16:35:42][http-9443-Processor23]: ExtendedKeyUsageExtDefault: populate start
[29/Jun/2009:16:35:42][http-9443-Processor23]: ExtendedKeyUsageExtDefault: populate end
[29/Jun/2009:16:35:42][http-9443-Processor23]: SubjectAltNameExtDefault: populate start
[29/Jun/2009:16:35:42][http-9443-Processor23]: SubjectAltNameExtDefault: createExtension i=0
[29/Jun/2009:16:35:42][http-9443-Processor23]: gname is empty, not added
[29/Jun/2009:16:35:42][http-9443-Processor23]: count is 0
[29/Jun/2009:16:35:42][http-9443-Processor23]: ProfileSubmitServlet: populate extension not found
[29/Jun/2009:16:35:42][http-9443-Processor23]: CMSServlet: curDate=Mon Jun 29 16:35:42 PDT 2009 id=caProfileSubmit time=379
Q:
I ran the Old_versionToTxt script and then TxtTo73, and I'm seeing AUTH_TOKEN errors in my debug log.
A:
The migration tools are out of date. The upgrade scripts are specific to the target version of Certificate System, meaning that to upgrade to 8.1, you need to run TxtTo81. This script is available with the new Certificate System 8.1 packages.
If the wrong script is used to convert the output text file, then you may see errors like this:
ERROR AuthToken type - AUTH_TOKEN:com.netscape.certsrv.authentication.AuthToken=uid:[Ljava.lang.String;=[Ljava.lang.String;@108ca1