SocketTimeoutException when the Hot Rod Java client tries to reconnect to Red Hat JBoss Data Grid

Environment

  • Red Hat Data Grid
    • 7.3.x
  • Hot Rod Java client

Issue

  • The JDG client times out when it tries to reconnect to the JDG server
  • After a JDG server node is stopped, the Hot Rod Java client blocks and then throws a SocketTimeoutException
  • The Hot Rod client does not retry when a server goes down and reports Connection reset by peer

Resolution

The Hot Rod client applies a default timeout to each operation, taken from the socket timeout (60000 milliseconds by default). However, when this timeout is reached, the retry logic is not invoked, so requests sent to a dead or failing server fail as soon as the timeout expires instead of being retried.
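The expected behavior can be illustrated with a minimal, self-contained sketch. It is not the Hot Rod client's actual implementation: the class name RetrySketch, the method callWithRetries, and the simulated operation are all hypothetical. It only shows the general idea that a timeout should count as a retryable failure, up to a maximum number of retries, rather than being passed straight to the caller.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Illustrative only: hypothetical names, not the Hot Rod client's real code.
class RetrySketch {

    // Runs the operation once and, if it times out, retries it up to
    // maxRetries more times before giving up.
    static <T> T callWithRetries(Callable<T> operation, int maxRetries) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        SocketTimeoutException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return operation.call();
            } catch (SocketTimeoutException e) {
                // Treat the timeout as a retryable failure instead of
                // propagating it to the caller right away.
                last = e;
            }
        }
        throw last; // every attempt timed out
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation (an assumption for this sketch):
        // it times out twice and then succeeds.
        String value = callWithRetries(() -> {
            if (++calls[0] <= 2) {
                throw new SocketTimeoutException("timed out");
            }
            return "value";
        }, 5);
        System.out.println(value + " after " + calls[0] + " attempts"); // value after 3 attempts
    }
}
```

With the defect described above, the equivalent of the catch branch is never reached for a timeout, so the first SocketTimeoutException goes directly to the application.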

  1. To resolve this issue, update the Hot Rod Java client to version 7.3.5, which is the Fix Version listed on JIRA JDG-3357.
  2. Set socket_timeout, connect_timeout, and max_retries in the client-side configuration.
    This can be done in the hotrod-application.properties file (or hotrod-client.properties), as below:
        infinispan.client.hotrod.socket_timeout = 5000
        infinispan.client.hotrod.connect_timeout = 5000
        infinispan.client.hotrod.max_retries = 5

Or programmatically:

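                import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

                // Timeouts are given in milliseconds.
                ConfigurationBuilder builder = new ConfigurationBuilder();
                builder.maxRetries(5)
                       .socketTimeout(80000)
                       .connectionTimeout(80000);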

The timeout values should be lower than the max-idle value defined in the server's clustered.xml file.
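The rule above can be checked before deployment with a small sketch that uses only the JDK. The class HotRodTimeoutCheck and its violations method are hypothetical helpers, not part of the Hot Rod client. The sketch assumes that the max-idle value has already been read from clustered.xml and converted to milliseconds, and that only the two timeout properties (not max_retries) are compared against it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the Hot Rod client API.
class HotRodTimeoutCheck {

    // Timeout properties from the client properties file shown above.
    static final String[] TIMEOUT_KEYS = {
        "infinispan.client.hotrod.socket_timeout",
        "infinispan.client.hotrod.connect_timeout"
    };

    // Returns the names of the timeout properties whose value is not
    // strictly lower than maxIdleMillis.
    static List<String> violations(Properties props, long maxIdleMillis) {
        List<String> bad = new ArrayList<>();
        for (String key : TIMEOUT_KEYS) {
            String raw = props.getProperty(key);
            if (raw == null) {
                continue; // not set: the client default applies and is not checked here
            }
            if (Long.parseLong(raw.trim()) >= maxIdleMillis) {
                bad.add(key);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("infinispan.client.hotrod.socket_timeout", "5000");
        props.setProperty("infinispan.client.hotrod.connect_timeout", "5000");
        props.setProperty("infinispan.client.hotrod.max_retries", "5");

        // Assumed server-side max-idle in milliseconds, for illustration only.
        long maxIdleMillis = 60_000L;
        System.out.println(violations(props, maxIdleMillis)); // prints []
    }
}
```

In a real project the Properties object would typically be loaded from the client properties file with Properties.load instead of being built by hand.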

Root Cause

This issue is tracked in the following JIRA issues:
- Upstream community JIRA: ISPN-10429
- Internal product JIRA: JDG-3357

Diagnostic Steps

Search for the SocketTimeoutException error in the Hot Rod Java client log, as follows:

Exception in thread "main" org.infinispan.client.hotrod.exceptions.TransportException:: java.net.SocketTimeoutException: GetOperation{api-general-filestore, key=[B0x033E03323531, flags=0} timed out after 60000 ms
    at org.infinispan.client.hotrod.impl.Util.rewrap(Util.java:54)
    at org.infinispan.client.hotrod.impl.Util.await(Util.java:27)
    at org.infinispan.client.hotrod.impl.RemoteCacheImpl.get(RemoteCacheImpl.java:418)
    at br.com.itau.jdg.JdgTest.main(JdgTest.java:19)
Caused by: java.net.SocketTimeoutException: GetOperation{api-general-filestore, key=[B0x033E03323531, flags=0} timed out after 60000 ms
    at org.infinispan.client.hotrod.impl.operations.HotRodOperation.run(HotRodOperation.java:172)
    at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
    at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:331)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
