Java process takes a long time with -XX:+AlwaysPreTouch


Environment

  • OpenJDK 11

Issue

  • Running a simple "Hello World" program takes 68 seconds with a 16GB heap when using -XX:+AlwaysPreTouch, but completes within the same second without the option. For example:
public class Test {
        public static void main(String [] args) {
                System.out.println("Hello World!");
        }
}

date; java -Xmx16G -Xms16G -XX:+AlwaysPreTouch Test; date
Wed Oct 6 12:21:28 EDT 2016
Hello World!
Wed Oct 6 12:22:36 EDT 2016

date; java -Xmx16G -Xms16G Test; date
Wed Oct 6 12:23:38 EDT 2016
Hello World!
Wed Oct 6 12:23:38 EDT 2016
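The date-based comparison above only has one-second resolution. As a rough sketch, the same two runs could be timed from a small Java harness. The class name StartupTimer and its methods are hypothetical helpers introduced for illustration; the harness assumes that java is on the PATH and that the compiled Test class is in the current directory.

```java
import java.io.IOException;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original reproducer.
class StartupTimer {

    // Builds the same command line as in the example above.
    // heapSize is passed to both -Xmx and -Xms, e.g. "16G".
    static List<String> buildCommand(String heapSize, boolean preTouch, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx" + heapSize);
        cmd.add("-Xms" + heapSize);
        if (preTouch) {
            cmd.add("-XX:+AlwaysPreTouch");
        }
        cmd.add(mainClass);
        return cmd;
    }

    // Starts the child JVM, waits for it to exit, and returns the elapsed wall-clock time.
    static Duration time(List<String> cmd) throws IOException, InterruptedException {
        long start = System.nanoTime();
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        process.waitFor();
        return Duration.ofNanos(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        // Heap size defaults to the 16G used in the example; pass a smaller value on small hosts.
        String heap = args.length > 0 ? args[0] : "16G";
        for (boolean preTouch : new boolean[] {true, false}) {
            List<String> cmd = buildCommand(heap, preTouch, "Test");
            System.out.println(String.join(" ", cmd) + " -> " + time(cmd).toMillis() + " ms");
        }
    }
}
```

Run it as "java StartupTimer.java 16G" from the directory that contains Test.class. The measured time includes the start-up and shutdown of the child JVM, so small differences between runs are not meaningful.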

Resolution

Pre-touch was parallelized for the G1 collector in JDK 9 (JDK-8157952).

If the issue is memory pressure, increase physical memory or decrease system memory demands.

To reduce the amount of memory held in caches, adjust the parameters below. The effect is modest, but it will reduce cache usage as long as the system is not reading a huge amount of file content at once.

Current values in /etc/sysctl.conf:

vm.dirty_background_ratio = 10
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000

Suggested values in /etc/sysctl.conf:

vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100

These values are not fixed recommendations. Lower or raise them depending on how the application behaves.

A quick explanation of each parameter follows.

vm.dirty_background_ratio = 3
For example, if a system has 1000 pages of memory and dirty_background_ratio is set to 3%, background writeback begins once 30 pages have been dirtied.

vm.dirty_ratio = 10
If it is set to 10% on a 1000-page system, a process dirtying pages is made to wait once the 100th page is dirtied. This mechanism thus slows the dirtying of pages while the system catches up.

vm.dirty_expire_centisecs = 500
How long dirty data can stay in the page cache before it is old enough to be written out (in hundredths of a second).

vm.dirty_writeback_centisecs = 100
How often pdflush is activated to clean dirty pages (in hundredths of a second).
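The arithmetic behind these explanations can be sketched in a few lines of Java. The class and method names are hypothetical, and the calculation follows the simplified 1000-page example above, where the ratio is applied to the total number of pages; the kernel's own accounting of which memory counts toward the ratio is more detailed.

```java
// Hypothetical helper that reproduces the worked numbers from the explanations above.
class DirtyThresholds {

    // Number of dirty pages at which a ratio-based threshold is reached.
    static long thresholdPages(long totalPages, int ratioPercent) {
        return totalPages * ratioPercent / 100;
    }

    // The *_centisecs parameters are expressed in hundredths of a second.
    static double centisecsToSeconds(long centisecs) {
        return centisecs / 100.0;
    }

    public static void main(String[] args) {
        long totalPages = 1000; // the example system size used in the text

        System.out.println(thresholdPages(totalPages, 3));  // dirty_background_ratio = 3  -> 30
        System.out.println(thresholdPages(totalPages, 10)); // dirty_ratio = 10            -> 100
        System.out.println(centisecsToSeconds(500));        // dirty_expire_centisecs      -> 5.0
        System.out.println(centisecsToSeconds(100));        // dirty_writeback_centisecs   -> 1.0
    }
}
```

With the suggested values, dirty data becomes eligible for writeout after 5 seconds instead of 30, and the writeback pass runs every second instead of every 5 seconds.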

After making the above changes in /etc/sysctl.conf, run sysctl -p to apply them. This can be done on the fly on a production system.
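To confirm that the new values are active, the running kernel's settings can be read back from /proc/sys. The following is a minimal sketch that relies on the standard Linux mapping of a sysctl key such as vm.dirty_ratio to the file /proc/sys/vm/dirty_ratio. The class name ShowDirtySettings is hypothetical, and Files.readString requires Java 11 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that prints the four parameters discussed above.
class ShowDirtySettings {

    static final String[] KEYS = {
        "vm.dirty_background_ratio",
        "vm.dirty_ratio",
        "vm.dirty_expire_centisecs",
        "vm.dirty_writeback_centisecs"
    };

    // Maps a sysctl key to its file under /proc/sys, e.g.
    // "vm.dirty_ratio" -> /proc/sys/vm/dirty_ratio
    static Path procPath(String key) {
        return Paths.get("/proc/sys", key.replace('.', '/'));
    }

    public static void main(String[] args) {
        for (String key : KEYS) {
            Path path = procPath(key);
            try {
                System.out.println(key + " = " + Files.readString(path).trim());
            } catch (IOException e) {
                // Not on Linux, or the file is not readable in this environment.
                System.out.println(key + ": cannot read " + path);
            }
        }
    }
}
```

Reading these files does not require root privileges on a typical system; only changing the values does.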

Root Cause

Scenario: without the -XX:+AlwaysPreTouch option
Behavior: The JVM max heap is allocated in virtual memory, not physical memory. It is recorded in an internal data structure so that it cannot be used by any other process. Not even a single page is allocated in physical memory until it is actually accessed. When the JVM needs memory, the operating system allocates pages as needed.

Scenario: with the -XX:+AlwaysPreTouch option
Behavior: The JVM touches every page of the max heap by writing a '0' to it, so the memory is allocated in physical memory in addition to being reserved in the internal data structure (virtual memory). Pretouching is single-threaded (except for the G1 collector in JDK 9+), so a delayed JVM startup is expected behavior. The trade-off is that page access is faster later, because the pages are already loaded into memory.
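To get a feel for the amount of work involved, the following sketch estimates how many pages a 16GB heap spans and what the 68-second run from the Issue section implies per page. It assumes a 4 KiB page size (no huge pages) and attributes the whole 68 seconds to pre-touching, so the result is an order-of-magnitude figure only. The class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope estimate, not a measurement.
class PreTouchEstimate {

    // Number of pages needed to back heapBytes, rounded up.
    static long pageCount(long heapBytes, long pageBytes) {
        return (heapBytes + pageBytes - 1) / pageBytes;
    }

    // Average time per page, in microseconds, if `seconds` were spent touching `pages` pages.
    static double microsPerPage(double seconds, long pages) {
        return seconds * 1_000_000.0 / pages;
    }

    public static void main(String[] args) {
        long heapBytes = 16L << 30; // -Xms16G -Xmx16G
        long pageBytes = 4096;      // assumption: 4 KiB pages

        long pages = pageCount(heapBytes, pageBytes);
        System.out.println("pages to touch: " + pages);
        System.out.printf("average cost per page: %.1f microseconds%n",
                microsPerPage(68, pages));
    }
}
```

Under these assumptions the heap spans 4194304 pages, each of which has to be faulted in before the main method runs.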

As a consequence of using AlwaysPreTouch, memory pressure causes a delay because the kernel must search the pages to find the oldest ones to reclaim. To drop caches, it has to check which pages are old and have not been referenced recently, since a recently accessed page has a higher chance of being used again. It therefore has to check all the pages to find the least recently used (LRU) ones.
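The cost described here can be illustrated with a toy model: finding the least recently used page means looking at every page, so the work grows with the number of resident pages. This is a deliberately simplified sketch of the description above, not the kernel's actual reclaim code, and the Page class and its fields are hypothetical.

```java
import java.util.List;

// Toy model of an LRU scan; names and structure are illustrative only.
class OldestPageScan {

    // A cached page with the (logical) time at which it was last referenced.
    static final class Page {
        final int id;
        final long lastReferenced;

        Page(int id, long lastReferenced) {
            this.id = id;
            this.lastReferenced = lastReferenced;
        }
    }

    // Linear scan: every page is examined once to find the oldest reference time.
    // Returns null for an empty list.
    static Page leastRecentlyUsed(List<Page> pages) {
        Page oldest = null;
        for (Page page : pages) {
            if (oldest == null || page.lastReferenced < oldest.lastReferenced) {
                oldest = page;
            }
        }
        return oldest;
    }

    public static void main(String[] args) {
        List<Page> pages = List.of(new Page(1, 50), new Page(2, 10), new Page(3, 30));
        System.out.println("reclaim candidate: page " + leastRecentlyUsed(pages).id);
    }
}
```

In this model, a 16GB pre-touched heap adds millions of resident pages to the set that has to be considered, which is why reclaim takes longer under memory pressure.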

Finally, in terms of footprint:

  • with +AlwaysPreTouch: the JVM pre-touches committed areas; this affects the heap, GC data structures, code heap, Metaspace, and class space
  • without +AlwaysPreTouch: the JVM does not pre-touch committed areas; this affects the heap, GC data structures, code heap, Metaspace, and class space

Diagnostic Steps

To check if memory pressure is causing the delay:

# sync
# echo 3 > /proc/sys/vm/drop_caches
$ date; java -Xmx16G -Xms16G Test; date

Use rsar to inspect the sar data in the sosreport to look for memory pressure. For example:

rsar -r sar01 | more

12:00:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
12:10:01 AM   1874168 262619440     99.29    938972 222082916  56889572     21.51
12:20:01 AM   2017612 262475996     99.24    939544 221902732  56920820     21.5
...
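One way to read this output is to check how much of the used memory is page cache. The sketch below parses a data row in the column order shown above and reports kbcached as a percentage of kbmemused. It assumes this exact column layout, which varies between sysstat versions, and the class and method names are hypothetical.

```java
// Hypothetical helper for rows in the layout:
// time AM/PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
class SarCacheShare {

    // Returns kbcached as a percentage of kbmemused for one data row.
    static double cachedPercentOfUsed(String row) {
        String[] fields = row.trim().split("\\s+");
        long kbMemUsed = Long.parseLong(fields[3]);
        long kbCached = Long.parseLong(fields[6]);
        return 100.0 * kbCached / kbMemUsed;
    }

    public static void main(String[] args) {
        String row = "12:10:01 AM   1874168 262619440     99.29    938972 222082916  56889572     21.51";
        System.out.printf("cache share of used memory: %.1f%%%n", cachedPercentOfUsed(row));
    }
}
```

For the first row above, roughly 85% of the used memory is page cache, which matches the pattern of a system that looks almost full (%memused above 99) mainly because of cached file data.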

