How to debug DB connection issues in RHEV-M environment

Solution Verified - Updated -

Environment

Red Hat Enterprise Virtualization Manager (RHEV-M) - 3.1 - 3.6

Issue

  • In certain cases RHEV-M engine, reporting services and notification service cannot reach database. Instead of performing tasks exception messages are logged in the log of the specific service.

  • Purpose of this article is to provide you with steps necessary for debugging these connection issues on the side of JBoss application server.

  • How to generate jboss thread dump?

Resolution

  • Obtain a thread stack trace of the Java Virtual Machine at the time of the issue, use below command to get the same:

    # kill -3 $(cat /var/run/ovirt-engine-pid)
    
  • The results are captured in /var/log/ovirt-engine/console.log file. The stack traces in it will help to check if there is any thread that is blocked trying to do some database operation, thus holding on a connection.

  • Activate the cached connection manager in the application server, which will generate some additional debug information if there happen to be leaked connections.

    • To do this edit /usr/share/ovirt-engine/service/engine-service.xml file
    • And add user-ccm="true" to the datasource definition (line 99):

      <datasource jndi-name="java:/ENGINEDataSource"
      pool-name="ENGINEDataSource" enabled="true" use-ccm="true">
      
    • Then enable debug of the cached connection manager, same file, after line 191:

      175     <subsystem xmlns="urn:jboss:domain:jca:1.1">
      176       <archive-validation enabled="false" fail-on-error="false" fail-on-warn="false"/>
      177       <bean-validation enabled="true"/>
      178       <default-workmanager>
      179         <short-running-threads>
      180           <core-threads count="50"/>
      181           <queue-length count="50"/>
      182           <max-threads count="50"/>
      183           <keepalive-time time="10" unit="seconds"/>
      184         </short-running-threads>
      185         <long-running-threads>
      186           <core-threads count="50"/>
      187           <queue-length count="50"/>
      188           <max-threads count="50"/>
      189           <keepalive-time time="10" unit="seconds"/>
      190         </long-running-threads>
      191       </default-workmanager>
      192       <cached-connection-manager debug="true"/>        <--- Please add this line in this location
      193     </subsystem>
      
  • After performing the changes restart RHEV-M related services.

  • With this in place if the application forgets to close a database connection it should generate something like below in server.log file:

    Closing a connection for you. Please close them yourself
    
  • This should be accompanied by the stack trace of the place in the code that created the connection, which is very useful to debug it.

Root Cause

Various issues causing the database connection pool to get exhauseted by either engine or one of its services.

Diagnostic Steps

  • User cannot log into the relevant service, no action can be performed.

  • Following messages are logged in the log of engine or relevant service:

    org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:282): 
                                Caused by: org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
                                Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method) [rt.jar:1.7.0_19]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) [rt.jar:1.7.0_19]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) [rt.jar:1.7.0_19]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) [rt.jar:1.7.0_19]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391) [rt.jar:1.7.0_19
    
  • Stat activity from the database shows more than 100 connections opened.

    SELECT * FROM pg_stat_activity;
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.