Satellite server not sending latest Errata patches or package updates to clients: Taskomatic and Tomcat Java heap dumps with "OutOfMemoryError"


Environment

  • Red Hat Satellite 5.x

Issue

  • Red Hat Satellite 5 server is producing Java heap dumps, which causes 503 errors on the Satellite web interface in addition to clogging inboxes
  • Heap dumps are being written to the system's /usr directory, filling up the root filesystem
  • Clients fail to sync, update through yum, or receive the latest errata patches from the Satellite server due to Taskomatic or Tomcat out-of-memory errors on the server
  • Satellite server is slow or throws out-of-memory exceptions after an upgrade to 5.8
  • When trying to update a client system, "Error while executing packages action: empty transaction" messages are received from Satellite
  • Satellite server is not syncing yum repositories due to OOM exceptions
  • Taskomatic logs show the error: java.lang.OutOfMemoryError
  • catalina.out shows the error: ERROR com.redhat.rhn.common.messaging.ActionExecutor - java.lang.OutOfMemoryError
  • Channel repodata generation has been showing RUNNING status for a long time

Resolution

  • This problem was addressed by the erratum RHBA-2014-1651. Make sure you have applied it.

  • In some cases, depending on the workload, it may still be necessary to increase the JVM heap space. You can do that by adding the following values to /etc/rhn/rhn.conf:

    # Initial Java Heap Size (in MB)
    taskomatic.java.initmemory=512
    
    # Maximum Java Heap Size (in MB)
    taskomatic.java.maxmemory=1512
    
  • Restart the Satellite service:

    #  rhn-satellite restart
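
  • A quick way to confirm that Taskomatic picked up the new limits after the restart is to inspect the running JVM's command line. This is just a sanity-check sketch, assuming the TaskomaticDaemon class name shown in the Diagnostic Steps below:

    # Show the heap flags of the running Taskomatic JVM
    # (the [T] bracket trick keeps grep from matching itself)
    ps -ef | grep '[T]askomaticDaemon' | grep -oE 'Xm[sx][0-9]+m'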
    
  • If you are experiencing issues with Tomcat, consider increasing the maximum heap size from 256 MB to 512 MB.

  • On the Satellite server, edit the file /etc/sysconfig/tomcat6 and increase -Xmx to a higher value, such as 512 MB, as shown here:

    JAVA_OPTS="$JAVA_OPTS -ea -Xms256m -Xmx512m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC"
    
  • Restart the tomcat daemon:

        #  service tomcat6 restart
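
  • Similarly, the heap flags of the running Tomcat JVM can be checked after the restart; a sketch, assuming the org.apache.catalina.startup.Bootstrap class name seen in the javacore excerpt below:

    # Show the heap flags of the running Tomcat JVM
    ps -ef | grep '[B]ootstrap' | grep -oE 'Xm[sx][0-9]+m'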
    
  • To have future heap dumps written to a directory other than /usr, follow the instructions in this article: How do I change the directory where java heapdumps are written?
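
  • As a hedged sketch only (the linked article is authoritative), the IBM JVM that produces the JVMDUMP messages shown below also honours environment variables that redirect its dump files; the target directory here is purely illustrative:

    # Assumption: IBM J9 JVM. Exported in the environment of the tomcat/taskomatic
    # services, these redirect future heap dumps and javacores away from /usr.
    export IBM_HEAPDUMPDIR=/var/crash
    export IBM_JAVACOREDIR=/var/crash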

Root Cause

  • Currently, the minimum heap size is set to 256 MB and the maximum to 512 MB in /etc/rhn/default/rhn_taskomatic_daemon.conf. Java typically uses around 500 MB of heap on its own, so the Java process (in this case Taskomatic) can run out of heap memory and produce this error.
  • In Satellite 5.8, the above configuration file is located in /usr/share/rhn/config-defaults/.

  • Currently, both the minimum and the maximum heap size are set to 256 MB in /etc/tomcat5/tomcat5.conf.

  • This error is documented fully under Java application "java.lang.OutOfMemoryError: GC overhead limit exceeded".

  • This issue was reported in Bugzilla #1132398.
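
  • To see which limits are currently in effect on a given Satellite, the configuration files named above can be inspected directly (a sketch; the paths vary by Satellite version as noted):

    # Taskomatic wrapper defaults (use the path matching your Satellite version)
    grep -i memory /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf
    # Tomcat heap flags (tomcat5 or tomcat6 depending on the release)
    grep -ohE 'Xm[sx][0-9]+m' /etc/tomcat*/tomcat*.conf /etc/sysconfig/tomcat* 2>/dev/null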

Diagnostic Steps

1TISIGINFO     Dump Event "systhrow" (00040000) Detail "java/lang/OutOfMemoryError" received
...
1CICMDLINE     /usr/bin/java -Dibm.dst.compatibility=true -Xms256m -Xmx512m -Djava.library.path=/usr/lib:/usr/lib64:/usr/lib/oracle/10.2.0.4/client64/lib:/usr ... -Dwrapper.jvmid=5 com.redhat.rhn.taskomatic.core.TaskomaticDaemon
...
2CIUSERARG               -Xms256m
2CIUSERARG               -Xmx512m

Here we can see that the JVM for 'com.redhat.rhn.taskomatic.core.TaskomaticDaemon' is encountering an OutOfMemoryError and that the current JVM heap limits are 256 MB for the startup size and 512 MB for the maximum allowed size. Increasing the -Xmx setting should resolve the issue.
...
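
The same heap arguments can be pulled straight out of a javacore file with grep; a convenience sketch, assuming the dump location reported in the logs below:

    # Show the user-supplied -Xms/-Xmx arguments recorded in a javacore
    grep -E '2CIUSERARG.+Xm[sx]' /tmp/javacore.*.txt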

  • For Tomcat, we can see that the javacore file contains:
1CICMDLINE     /usr/lib/jvm/java/bin/java -Dcatalina.ext.dirs=/usr/share/tomcat5/shared/lib:/usr/share/tomcat5/common/lib -Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory -ea -Xms256m -Xmx256m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -XX:MaxNewSize=256 -XX:-UseConcMarkSweepGC -Dcatalina.ext.dirs=/usr/share/tomcat5/shared/lib:/usr/share/tomcat5/common/lib -Djavax.sql.DataSource.Factory=org.apache.commons.dbcp.BasicDataSourceFactory -Djava.endorsed.dirs=/usr/share/tomcat5/common/endorsed -classpath /usr/lib/jvm/java/lib/tools.jar:/usr/share/tomcat5/bin/bootstrap.jar:/usr/share/tomcat5/bin/commons-logging-api.jar:/usr/share/java/mx4j/mx4j-impl.jar:/usr/share/java/mx4j/mx4j-jmx.jar -Dcatalina.base=/usr/share/tomcat5 -Dcatalina.home=/usr/share/tomcat5 -Djava.io.tmpdir=/usr/share/tomcat5/temp org.apache.catalina.startup.Bootstrap start

2CIUSERARG               -Xms256m
2CIUSERARG               -Xmx256m

Here we can see that both the minimum and maximum JVM heap limits are set to 256 MB. To resolve the issue, increase the maximum heap limit (-Xmx) to 512 MB.
  • The messages in the /var/log/messages file are:

/var/log/messages.1:Jan  9 01:52:15  java[27378]: JVMDUMP032I JVM requested Java dump using '/tmp/javacore.20130109.015204.27378.0002.txt' in response to an event 
/var/log/messages.1:Jan  9 01:52:15  java[27378]: JVMDUMP032I JVM requested Snap dump using '/usr/share/tomcat5/Snap.20130109.015204.27378.0003.trc' in response to an event 
/var/log/messages.1:Jan  9 01:52:16  java[27378]: JVMDUMP032I JVM requested Heap dump using '/tmp/heapdump.20130109.015216.27378.0004.phd' in response to an event  
  • java.lang.OutOfMemoryError exception in /var/log/tomcat5/catalina.out file:
JVMDUMP032I JVM requested Java dump using '/tmp/javacore.20130109.121341.10405.0011.txt' in response to an event
JVMDUMP010I Java dump written to /tmp/javacore.20130109.121341.10405.0011.txt
JVMDUMP032I JVM requested Snap dump using '/usr/share/tomcat5/Snap.20130109.121341.10405.0012.trc' in response to an event
JVMDUMP030W Cannot write dump to file /usr/share/tomcat5/Snap.20130109.121341.10405.0012.trc: Permission denied
JVMDUMP010I Snap dump written to /tmp/Snap.20130109.121341.10405.0012.trc
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1572)
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(Exception in thread "Thread-7" ContainerBase.java:1559)
    at java.lang.Thread.run(Thread.java:736)
java.lang.OutOfMemoryError
    at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:777)
    at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:771)
    at java.lang.Thread.uncaughtException(Thread.java:1213)
2013-01-09 12:13:44,158 [RHN Message Dispatcher] ERROR com.redhat.rhn.common.messaging.ActionExecutor - java.lang.OutOfMemoryError
  • Exception in the /var/log/rhn/rhn_taskomatic_daemon.log file:
INFO   | jvm 28   | 2013/10/03 14:05:15 | JVMDUMP006I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" - please wait.
INFO   | jvm 28   | 2013/10/03 14:05:15 | JVMDUMP006I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" - please wait.
INFO   | jvm 28   | 2013/10/03 14:05:15 | JVMDUMP032I JVM requested Heap dump using '/usr/sbin/heapdump.20131003.140515.9571.0010.phd' in response to an event
INFO   | jvm 28   | 2013/10/03 14:05:20 | JVMDUMP010I Heap dump written to /usr/sbin/heapdump.20131003.140515.9571.0010.phd
INFO   | jvm 28   | 2013/10/03 14:05:20 | JVMDUMP032I JVM requested Snap dump using '/usr/sbin/Snap.20131003.140515.9571.0012.trc' in response to an event
INFO   | jvm 28   | 2013/10/03 14:05:20 | JVMDUMP010I Snap dump written to /usr/sbin/Snap.20131003.140515.9571.0012.trc
INFO   | jvm 28   | 2013/10/03 14:05:20 | JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
INFO   | jvm 28   | 2013/10/03 14:05:20 | JVMDUMP032I JVM requested Java dump using '/usr/sbin/javacore.20131003.140515.9571.0011.txt' in response to an event
INFO   | jvm 28   | 2013/10/03 14:05:20 | JVMDUMP010I Java dump written to /usr/sbin/javacore.20131003.140515.9571.0011.txt
INFO   | jvm 28   | 2013/10/03 14:05:20 | JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
INFO   | jvm 28   | 2013/10/03 14:05:20 | Exception in thread "Thread-49" java.lang.OutOfMemoryError
  • Taskomatic reports java.lang.OutOfMemoryError: GC overhead limit exceeded
  • After downloading a new channel, the following error is reported in the Taskomatic logs:
INFO   | jvm 1    | 2013/11/22 11:00:17 | 2013-11-22 11:00:17,134 [Thread-53] INFO  com.redhat.rhn.taskomatic.task.repomd.RepositoryWriter - Generating new repository metadata for channel 'rhel-x86_64-server-5'(sha1) 15661 packages, 2856 errata
INFO   | jvm 1    | 2013/11/22 11:00:50 | Exception in thread "Thread-53" java.lang.OutOfMemoryError: GC overhead limit exceeded

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

19 Comments

I was facing the same issue with Red Hat Satellite 5.6.
I've created several clones of the RHEL base channels (10,000+ packages) and put systems in these cloned channels, but was unable to use yum on these systems.

Each time I got the following message:
Error: Cannot retrieve repository metadata (repomd.xml) for repository: qa-rhel-x86_64-server-6. Please verify its path and try again

After investigation, I saw the following message in /var/log/rhn/rhn_taskomatic_daemon.log on the Satellite:
Exception in thread "Thread-58" java.lang.OutOfMemoryError: GC overhead limit exceeded
INFO | jvm 1 | 2014/04/17 10:35:14 | at java.lang.reflect.Method.copy(Method.java:151)
INFO | jvm 1 | 2014/04/17 10:35:14 | at java.lang.reflect.ReflectAccess.copyMethod(ReflectAccess.java:136)
INFO | jvm 1 | 2014/04/17 10:35:14 | at sun.reflect.ReflectionFactory.copyMethod(ReflectionFactory.java:300)
INFO | jvm 1 | 2014/04/17 10:35:14 | at java.lang.Class.copyMethods(Class.java:2852)
INFO | jvm 1 | 2014/04/17 10:35:14 | at java.lang.Class.getMethods(Class.java:1467)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.CachedStatement.addToObject(CachedStatement.java:739)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.CachedStatement.processResultSet(CachedStatement.java:626)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.CachedStatement.execute(CachedStatement.java:474)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.CachedStatement.execute(CachedStatement.java:443)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.CachedStatement.execute(CachedStatement.java:345)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.CachedStatement.execute(CachedStatement.java:351)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.CachedStatement.execute(CachedStatement.java:287)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.common.db.datasource.SelectMode.execute(SelectMode.java:110)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.manager.task.TaskManager.getChannelPackageDtos(TaskManager.java:52)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.taskomatic.task.repomd.RpmRepositoryWriter.writeRepomdFiles(RpmRepositoryWriter.java:170)
INFO | jvm 1 | 2014/04/17 10:35:14 | at com.redhat.rhn.taskomatic.task.repomd.ChannelRepodataWorker.run(ChannelRepodataWorker.java:104)
INFO | jvm 1 | 2014/04/17 10:35:14 | at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:756)
INFO | jvm 1 | 2014/04/17 10:35:14 | at java.lang.Thread.run(Thread.java:744)

The satellite JVM seems to run out of memory.
So I decided to double the wrapper.java.maxmemory value from 1024 (the default value) to 2048 in /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf:

# Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=2048

Then I restarted the taskomatic service:

service taskomatic restart

I haven't touched the /etc/tomcat6/tomcat6.conf file because no -Xmx parameter was configured there before.
And now I see no more problems with Java in the taskomatic log:
INFO | jvm 1 | 2014/04/17 10:45:57 | 2014-04-17 10:45:57,705 [Thread-53] INFO com.redhat.rhn.taskomatic.task.repomd.RepositoryWriter - Generating new repository metadata for channel 'qa-rhel-x86_64-server-6'(sha256) 12439 packages, 2340 errata
INFO | jvm 1 | 2014/04/17 10:48:02 | 2014-04-17 10:48:01,950 [Thread-53] INFO com.redhat.rhn.taskomatic.task.repomd.RepositoryWriter - Repository metadata generation for 'qa-rhel-x86_64-server-6' finished in 124 seconds
INFO | jvm 1 | 2014/04/17 10:48:03 | 2014-04-17 10:48:03,109 [Thread-53] INFO com.redhat.rhn.taskomatic.task.repomd.RepositoryWriter - File Modified Date:2014-03-26 14:35:05 CET

I've found a clue about that issue in the following message:
https://www.redhat.com/archives/spacewalk-list/2013-March/msg00120.html

Regards.

Thanks Vincent, it's cool info.

For the past few weeks I'd had an issue with repodata being scheduled for generation but never actually created by taskomatic. I'd been checking the taskomatic log with tail -f /var/log/rhn/rhn_taskomatic_daemon.log, but today I finally saw the java error which pointed me here, and the resolution worked.

One suggestion: the rhn-satellite restart will restart tomcat as well, so why make the Java option changes afterwards and restart tomcat again, when it can all be done in one go? That is what I did and it worked.

I just ran into this today, wondering why an errata was not getting picked up during a system update.

We were seeing this issue with our 5.6 Satellite server. The resolution resolved our issues.

tsc sysadmin: Same here...running 5.6. I simply followed Vincent Alloy's direction, bounced taskomatic, and PRESTO ... back in business.

Red Hat - Perhaps take a look at the defaults and either adjust them or let us know what else we should be doing.

Ran into this today on Satellite 5.6.0. Changed wrapper.java.maxmemory=2048 in /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf and -Xmx512m in /etc/sysconfig/tomcat6 and then restarted satellite. I needed to do a 'rhn-profile-sync' on one of the clients since 'yum update' wasn't showing any available updates yet Satellite said there were some. Haven't had any trouble with any of the other clients. Working good now.

I suppose a preview of things to come -- in the spacewalk-list we had a similar conversation, and it was pointed out that in the current version of spacewalk you would make adjustments to values found in /etc/rhn/rhn.conf...HOWEVER, that apparently is not yet the case for Satellite at the moment. In the future it looks like we'll need to move our current config adjustment from the location outlined in this thread to the new one...but not yet! Maybe a Satellite 6.x adjustment?

I'll keep following this thread in the meantime.

All,

Take a look above, the article has been amended with the proper configuration adjustments. I'll be updating my personal install notes.

I've used this before and it solved the problem but I'm getting the issue again.

Hello,

I also set the values proposed by Red Hat some months ago. But obviously some update replaced the increased heap size with the old values again.

CR

Filed 01228952 with Red Hat due to this still being an issue with Satellite 5.6 using default values. I surely do hope that package updates to the Satellite server will not roll over my values and that Red Hat is more realistic with these settings next time around.

/usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf:

# Initial Java Heap Size (in MB)
-wrapper.java.initmemory=256
+wrapper.java.initmemory=1024

# Maximum Java Heap Size (in MB)
-wrapper.java.maxmemory=1024
+wrapper.java.maxmemory=4096

There is far too much foolishness like this to deal with when coping with Satellite 5.x. I really hope Satellite 6 is less annoying.

Thanks for this article, it saved my day.

Hope that after the next update the default values will be correct.

If your satellite servers are not facing the public internet, and are disconnected, and you download ISO channel dumps and ingest them, you will not get the spacewalk-java updates. I created a discussion at this link. Now I know to manually download the spacewalk-java rpms. More details in that discussion.

Fixed the issue. Thanks!

It seems that in 5.7.0 you can now specify the maximum memory for taskomatic in /etc/rhn/rhn.conf with the parameter "taskomatic.maxmemory". The value is an integer (MB). I just tested it and it seems to be working so far.

From the man pages:

taskomatic.maxmemory (integer)
       The maximum amount of memory (MB) that Taskomatic can use.
       If you find that Taskomatic is running out of memory, consider increasing this.

       Default: 1024
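
A minimal sketch of how that might look (the parameter name is as quoted above; the restart step is my assumption):

# /etc/rhn/rhn.conf
taskomatic.maxmemory = 2048

# then restart taskomatic so it takes effect
service taskomatic restart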

Just upgraded our Satellite server from 5.4.1 to 5.6.0 and experienced the java errors listed here. This was causing the taskomatic process to not fully complete and resulted in missing repomd.xml errors. After searching the Red Hat Knowledgebase we eventually came across this page and the suggested fix has now rectified our problem. Changed the /usr/share/rhn/config-defaults/rhn_taskomatic_daemon.conf value to "wrapper.java.maxmemory=4096" and added "-Xmx512m" as per the example to the /etc/sysconfig/tomcat5 file, then ran "rhn-satellite restart" while tailing rhn_taskomatic_daemon.log. Give it some time to complete if you have lots of channels; ours took 90 minutes. All good.
Cheers to all for your help.

We've just encountered a similar - though different - issue on the latest 5.6: We had timeouts with rhnreg_ks, 503 errors in the Web UI, and some extremely slow database operations.

Eventually we discovered that a load of crashdumps were lying around in /usr, reporting an OutOfMemory error on the rhn-search daemon - it needs more than 1 GB in our case (fixed it via wrapper.java.maxmemory in /usr/share/rhn/config-defaults/rhn_search_daemon.conf).

This article came up while searching for the memory limits (and the initial search for our symptoms came up empty), so the note goes here.
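
For reference, a sketch of that kind of change (the 2048 value is illustrative; the point is simply to go above the ~1 GB it was hitting), followed by a restart of the rhn-search service or a full rhn-satellite restart:

# /usr/share/rhn/config-defaults/rhn_search_daemon.conf
wrapper.java.maxmemory=2048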

This may be old news but I just encountered this on a 5.8 system. We upgraded from 5.6 to 5.8 a few months back, and this started occurring. This fixed it right up. However, my CPU count has skyrocketed now. Working with support, so hopefully it gets fixed. Odd that this is not in the upgrade documentation or addressed by an erratum.