ping timeout in gluster

Solution Verified - Updated -

Environment

Red Hat Gluster Storage 3.X

Issue

Clients can lose connection with the bricks due to ping timeout. This can lead to different behaviors in the applications based on the situation. The most likely error is that applications will start facing the error "Transport endpoint not connected"

Resolution

The ping-timeout value can be set to a higher value in situations when the brick is not responsive to the client for the default 42 seconds.

# gluster volume set <VOL Name> network.ping-timeout <Value>

Root Cause

Network.ping-timeout is the time duration for which the client waits to check if the server is responsive. When a ping timeout happens, there is a network disconnect between the client and server. All resources held by server on behalf of the client get cleaned up. When a reconnection happens, all resources will need to be re-acquired before the client can resume its operations on the server. Additionally, the locks will be acquired and the lock tables updated.

For this to happen properly, we have made the default timeout value to 42 sec based on the various tests we have done and considering other possible conditions. However, this value can be set to a higher or lower value.

Reducing this value to a very low value may trigger false alert even though the server is not down. Suppose, if the system load is a bit high, obviously it will take some time for the system to respond to the read/write requests. And by that time, if the network.ping-timeout is set to a low value, there are chances that it will consider the other server is actually down and issue a frame unwind. Frame unwind means, all the requests from the application, that have been sent to the brick and awaiting a response will be returned back to the application with a error. This leads to application seeing "Transport endpoint not connected" errors for their requests.

A gluster brick can become non responsive to the client's ping request in one of the following situations.

1) The load is too high on the volume and the brick is busy executing the incoming IO requests. In such situation, the brick might not be able to send a response to the client for a period of ping timeout (i.e. by default 42 seconds). The client will not see a response from the brick (either to one of the IO requests that were sent or to a explicit ping request that was sent).

This condition can be further verified by taking the profile info of the asociated gluster volume. Please look at Performance monitoring of RHGS volumes for understanding more about profiling in gluster.

2) The gluster brick process is blocked while doing IO on the backend filesystem. If the thread (called epoll thread) from the brick that receives the IO request from the clients over network is busy waiting for a disk operation to complete for a file/directory, then it will not be able to serve further requests from the network queue until the disk operation on which the epoll thread is waiting, is completed.

This condition can be verified by checking if gluster process is blocked on backend disk filesystem operation. Please look at Backend filesystm and gluster performance.

3) Some network issue, due to which brick's responses does not reach the client.

This condition can be further verified by looking for network related errors in the system log file (/var/log/messages) and taking the tcpdump.

Diagnostic Steps

A ping timeout can be confirmed by looking at the gluster client logs.

NOTE: All the below are logically gluster client

  * Native glusterfs fuse client
  * Native gluster NFS server
  * NFS Ganesha server
  * SAMBA server

Based on the access protocol used, the log files of the above gluster client will indicate if a ping-timeout happened or not.

A ping timeout will have logs like below in the client.

[2017-07-02 08:23:01.178705] I [MSGID: 108026] [afr-self-heal-common.c:1212:afr_log_selfheal] 0-ipat-vmstor0-replicate-0: Completed data selfheal on 5570a524-4fa5-461f-8675-6e32e67e7fe7. sources=[0] 1  sinks=2 
[2017-07-02 08:52:25.008819] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-ipat-vmstor0-client-1: server 172.27.22.22:49152 has not responded in the last 10 seconds, disconnecting.
[2017-07-02 08:52:25.009120] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-ipat-vmstor0-client-1: disconnected from ipat-vmstor0-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2017-07-02 08:52:25.009350] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fe5bf987752] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fe5bf74d8ce] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fe5bf74d9de] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7fe5bf74f150] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2b0)[0x7fe5bf74fc10] ))))) 0-ipat-vmstor0-client-1: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2017-07-02 08:52:14.257220 (xid=0x7fdf83)
[2017-07-02 08:52:25.009366] W [MSGID: 114031] [client-rpc-fops.c:855:client3_3_writev_cbk] 0-ipat-vmstor0-client-1: remote operation failed [Transport endpoint is not connected]
[2017-07-02 08:52:25.009416] W [MSGID: 114031] [client-rpc-fops.c:1830:client3_3_fxattrop_cbk] 0-ipat-vmstor0-client-1: remote operation failed
[2017-07-02 08:52:25.009578] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fe5bf987752] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fe5bf74d8ce] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fe5bf74d9de] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7fe5bf74f150] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2b0)[0x7fe5bf74fc10] ))))) 0-ipat-vmstor0-client-1: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2017-07-02 08:52:14.257298 (xid=0x7fdf84)
[2017-07-02 08:52:25.009595] W [MSGID: 114031] [client-rpc-fops.c:2938:client3_3_lookup_cbk] 0-ipat-vmstor0-client-1: remote operation failed. Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint is not connected]
[2017-07-02 08:52:25.009757] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fe5bf987752] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fe5bf74d8ce] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fe5bf74d9de] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7fe5bf74f150] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2b0)[0x7fe5bf74fc10] ))))) 0-ipat-vmstor0-client-1: forced unwinding frame type(GlusterFS 3.3) op(READ(12)) called at 2017-07-02 08:52:14.257555 (xid=0x7fdf85)
[2017-07-02 08:52:25.009785] W [MSGID: 114031] [client-rpc-fops.c:3002:client3_3_readv_cbk] 0-ipat-vmstor0-client-1: remote operation failed [Transport endpoint is not connected]
[2017-07-02 08:52:25.009845] W [MSGID: 114031] [client-rpc-fops.c:1495:client3_3_fstat_cbk] 0-ipat-vmstor0-client-1: remote operation failed [Transport endpoint is not connected]
[2017-07-02 08:52:25.009859] E [MSGID: 114031] [client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-ipat-vmstor0-client-1: remote operation failed [Transport endpoint is not connected]
[2017-07-02 08:52:25.010002] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fe5bf987752] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fe5bf74d8ce] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fe5bf74d9de] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7fe5bf74f150] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2b0)[0x7fe5bf74fc10] ))))) 0-ipat-vmstor0-client-1: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2017-07-02 08:52:15.007506 (xid=0x7fdf86)
[2017-07-02 08:52:25.010016] W [rpc-clnt-ping.c:203:rpc_clnt_ping_cbk] 0-ipat-vmstor0-client-1: socket disconnected

WHen the above logs are seen, we can confirm from the corresponding node. We have to check the logs of the associated brick on that node around the same time.

i.e. in this situation we have to check the logs of the bricks running on the node 172.27.22.22 around the time "2017-07-02 08:52:25".

The brick logs in this case had a disconnection message for the corresponding client

[2017-07-02 08:52:33.409266] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ipat-vmstor0-server: Shutting down connection node.private-4808-2017/06/14-16:29:11:242270-ipat-vmstor0-client
-1-0-14

Mainly checking the gluster client logs will be helpful in determining whether ping timeout happened or not.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.