Red Hat Training

A Red Hat training course is available for Red Hat JBoss Data Virtualization

Chapter 9. Performance Tuning

9.1. Memory Management Considerations

Here is some information about settings related to memory management.
Red Hat JBoss Data Virtualization uses a BufferManager to track both memory and disk usage. Configuring the BufferManager properly is one of the most important parts of ensuring high performance. In most instances though the default settings are sufficient as they will scale with the JVM and consider other properties such as the setting for max active plans. Execute the following command using the CLI to find all of the possible settings for the BufferManager:
/subsystem=teiid:read-resource
All of the properties that start with "buffer-service" can be used to configure BufferManager. Here are the CLI commands you can use to change these settings:
/subsystem=teiid:write-attribute(name=buffer-service-use-disk,value=true)
/subsystem=teiid:write-attribute(name=buffer-service-encrypt-files,value=false)
/subsystem=teiid:write-attribute(name=buffer-service-processor-batch-size,value=256)
/subsystem=teiid:write-attribute(name=buffer-service-max-open-files,value=64)
/subsystem=teiid:write-attribute(name=buffer-service-max-file-size,value=2048)
/subsystem=teiid:write-attribute(name=buffer-service-max-processing-kb,value=-1)
/subsystem=teiid:write-attribute(name=buffer-service-max-reserve-kb,value=-1)
/subsystem=teiid:write-attribute(name=buffer-service-max-buffer-space,value=51200)
/subsystem=teiid:write-attribute(name=buffer-service-max-inline-lobs,value=true)
/subsystem=teiid:write-attribute(name=buffer-service-memory-buffer-space,value=-1)
/subsystem=teiid:write-attribute(name=buffer-service-max-storage-object-size,value=8388608)
/subsystem=teiid:write-attribute(name=buffer-service-memory-buffer-off-heap,value=false)

Warning

Do not change these properties until you understand the properties and any potential issue that is being experienced.
Here is an explanation of the main settings:
  • max-reserve-kb (default -1) - setting determines the total size in kilobytes of batches that can be held by the BufferManager in memory. This number does not account for persistent batches held by soft (such as index pages) or weak references. The default value of -1 will auto-calculate a typical max based upon the max heap available to the VM. The auto-calculated value assumes a 64bit architecture and will limit buffer usage to 40% of the first gigabyte of memory beyond the first 300 megabytes (which are assumed for use by the AS and other Red Hat JBoss Data Virtualization purposes) and 50% of the memory beyond that. The additional caveat here is that if the size of the memory buffer space is not specified, then it will effectively be allocated out of the max reserve space. A small adjustment is also made to the max reserve to account for batch tracking overhead.
    With default settings and an 8GB VM size, then max-reserve-kb will at a max use: (((1024-300) * 0.4) + (7 * 1024 * 0.5)) = 4373.6 MB or 4,478,566 KB
  • The BufferManager automatically triggers the use of a canonical value cache if enabled when more than 25% of the reserve is in use. This can dramatically cut the memory usage in situations where similar value sets are being read through Red Hat JBoss Data Virtualization but does introduce a lookup cost. If you are processing small or highly similar datasets through Red Hat JBoss Data Virtualization and wish to conserve memory, you should consider enabling value caching.
    Memory consumption can be significantly more or less than the nominal target depending upon actual column values and whether value caching is enabled. Large non built-in type objects can exceed their default size estimate. If an out of memory errors occur, then set a lower max-reserve-kb value. Also note that source lob values are held by memory references that are not cleared when a batch is persisted. With heavy LOB usage you should ensure that buffers of other memory associated with lob references are appropriately sized.
  • max-processing-kb (default -1) - setting determines the total size in kilobytes of batches that can be guaranteed for use by one active plan and may be in addition to the memory held based on max-reserve-kb. Typical minimum memory required by Red Hat JBoss Data Virtualization when all the active plans are active is #active-plans*max-processing-kb. The default value of -1 will auto-calculate a typical max based upon the max heap available to the VM and max active plans. The auto-calculated value assumes a 64bit architecture and will limit nominal processing batch usage to less than 10% of total memory.
    With default settings including 20 active-plans and an 8GB VM size, then max-processing-kb will be: (.07 * 8 * 1024)/20^.8 = 537.4 MB/11 = 52.2 MB or 53,453 KB per plan. This implies a nominal range between 0 and 1060 MB that may be reserved with roughly 53 MB per plan. You should be cautious in adjusting max-processing-kb on your own. Typically it will not need adjusted unless you are seeing situations where plans seem memory constrained with low performing large sorts.
  • max-file-size (default 2GB) - Each intermediate result buffer, temporary LOB, and temporary table is stored in its own set of buffer files, where an individual file is limited to max-file-size megabytes. Consider increasing the storage space available to all such files by increasing max-buffer-space, if your installation makes use of internal materialization, makes heavy use of SQL/XML, or processes large row counts.
  • processor-batch-size (default 256) - Specifies the target row count of a batch of the query processor. A batch is used to represent both linear data stores, such as saved results, and temporary table pages. Teiid will adjust the processor-batch-size to a working size based upon an estimate of the data width of a row relative to a nominal expectation of 2KB. The base value can be doubled or halved up to three times depending upon the data width estimation. For example a single small fixed width (such as an integer) column batch will have a working size of processor-batch-size * 8 rows. A batch with hundreds of variable width data (such as string) will have a working size of processor-batch-size / 8 rows. Any increase in the processor batch size beyond the first doubling should be accompanied with a proportional increase in the max-storage-object-size to accommodate the larger storage size of the batches.
    Additional considerations are needed if large VM sizes and/or datasets are being used. Red Hat JBoss Data Virtualization has a non-negligible amount of overhead per batch/table page on the order of 100-200 bytes. If you are dealing with datasets with billions of rows and you run into OutOfMemory issues, consider increasing the processor-batch-size to force the allocation of larger batches and table pages. A general guideline would be to double processor-batch-size for every doubling of the effective heap for Red Hat JBoss Data Virtualizationbeyond 4 GB - processor-batch-size = 512 for an 8 GB heap, processor-batch-size = 1024 for a 16 GB heap, etc.
  • max-storage-object-size (default 8288608 or 8MB) - The maximum size of a buffered managed object in bytes and represents the individual batch page size. If the processor-batch-size is increased and/or you are dealing with extremely wide result sets (several hundred columns), then the default setting of 8MB for the max-storage-object-size may be too low. The inline-lobs setting also can increase the size of batches containing small lobs. The sizing for max-storage-object-size is in terms of serialized size, which will be much closer to the raw data size than the Java memory footprint estimation used for max-reserved-kb. max-storage-object-size should not be set too large relative to memory-buffer-space since it will reduce the performance of the memory buffer. The memory buffer supports only 1 concurrent writer for each max-storage-object-size of the memory-buffer-space. Note that this value does not typically need to be adjusted unless the processor-batch-size is adjusted, in which case consider adjusting it in proportion to the increase of the processor-batch-size.
    If exceptions occur related to missing batches and "TEIID30001 Max block number exceeded" is seen in the server log, then increase the max-storage-object-size to support larger storage objects. Alternatively you could make the processor-batch-size smaller. memory-buffer-space (default -1) - This controls the amount of on or off heap memory allocated as byte buffers for use by the Red Hat JBoss Data Virtualization buffer manager measured in megabytes. This setting defaults to -1, which automatically determines a setting based upon whether it is on or off heap and the value for max-reserve-kb. The memory buffer supports only 1 concurrent writer for each max-storage-object-size of the memory-buffer-space. Any additional space serves as a cache for the serialized for of batches.
    When left at the default setting the calculated memory buffer space will be approximately 40% of the max-reserve-kb size. If the memory buffer is on heap and the max-reserve-kb is automatically calculated, then the memory buffer space will be subtracted out of the effective max-reserve-kb. If the memory buffer is off heap and the max-reserve-kb is automatically calculated, then its size will be reduced slightly to allow for effectively more working memory in the vm. memory-buffer-off-heap (default false) - Take advantage of the BufferManager memory buffer to access system memory without allocating it to the heap. Setting memory-buffer-off-heap to "true" will allocate the Red Hat JBoss Data Virtualization memory buffer off heap. Depending on whether your installation is dedicated to Red Hat JBoss Data Virtualization and the amount of system memory available, this may be preferable to on-heap allocation. The primary benefit is additional memory usage for Red Hat JBoss Data Virtualization without additional garbage collection tuning. This becomes especially important in situations where more than 32GB of memory is desired for the VM. Note that when using off-heap allocation, the memory must still be available to the java process and that setting the value of memory-buffer-space too high may cause the VM to swap rather than reside in memory. With large off-heap buffer sizes (greater than several gigabytes) you may also need to adjust VM settings.
For Sun VMs the relevant VM settings are MaxDirectMemorySize and UseLargePages. Here is an example:
-XX:MaxDirectMemorySize=12g -XX:+UseLargePages
Adding this to the VM process arguments would allow for an effective allocation of approximately an 11GB Red Hat JBoss Data Virtualization memory buffer (the memory-buffer-space setting) accounting for any additional direct memory that may be needed by Red Hat JBoss EAP or applications running within EAP.