What is rcv_space in the 'ss --info' output, and why is its value larger than net.core.rmem_max?


Hi,
can anyone help me understand what rcv_space is in the 'ss --info' output, e.g.:

#  ss --info dst 1.1.1.1
State      Recv-Q Send-Q                                            Local Address:Port                                                Peer Address:Port   
ESTAB      0      0                                                 2.2.2.2:39464                                              1.1.1.1:ssh     
         cubic wscale:7,0 rto:204 rtt:7.5/3 ato:40 cwnd:10 ssthresh:15 send 15.4Mbps rcv_rtt:4 rcv_space:517488

My second question is what units rcv_space is measured in. I noticed that its value is always divisible by 8.

My third question is why rcv_space is larger than the limits net.core.rmem_max and net.ipv4.tcp_rmem:

net.core.wmem_max = 131071
net.core.rmem_max = 13107
net.core.wmem_default = 229376
net.core.rmem_default = 229376
net.ipv4.tcp_mem = 7270 9693    14540
net.ipv4.tcp_wmem = 4096        16384   4194304
net.ipv4.tcp_rmem = 4096        8192    16384

My fourth question is how one can verify that the above sysctl parameters take effect. I thought ss would show it, but now I am not so sure.

Thanks!

References:
TCP(7)
/usr/share/doc/kernel-doc-2.6.32/Documentation/networking/ip-sysctl.txt
/usr/share/doc/kernel-doc-2.6.32/Documentation/sysctl/net.txt
/usr/include/netinet/tcp.h

Responses

rcv_space is used in TCP's internal auto-tuning to grow socket buffers based on how much data the kernel estimates the sender can send. It will change over the life of any connection. It's measured in bytes. You can see where the value is populated by reading the tcp_get_info() function in the kernel.
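If you want to read the same numbers from inside an application instead of from ss, the kernel exports them through the TCP_INFO socket option; struct tcp_info is declared in the /usr/include/netinet/tcp.h header you listed. Here is a minimal sketch, assuming a reachable TCP listener; the address and port are just placeholders:

/* Minimal sketch: connect somewhere, then print the same rcv_space value
 * that ss --info shows, via the TCP_INFO socket option. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>        /* struct tcp_info, TCP_INFO */

int main(void)
{
    struct sockaddr_in sa;
    struct tcp_info ti;
    socklen_t len = sizeof(ti);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(22);                        /* placeholder: local sshd */
    inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);  /* placeholder address */

    if (fd < 0 || connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        perror("connect");
        return 1;
    }

    memset(&ti, 0, sizeof(ti));
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0) {
        perror("getsockopt(TCP_INFO)");
        return 1;
    }

    /* tcpi_rcv_space is the field tcp_get_info() fills in for rcv_space */
    printf("rcv_space: %u bytes\n", ti.tcpi_rcv_space);

    close(fd);
    return 0;
}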

The value does not measure the actual socket buffer size, which is what net.ipv4.tcp_rmem controls. You'd need to call getsockopt() within the application to check the buffer size. You can see current buffer usage with the Recv-Q and Send-Q fields of ss.

Note that if the buffer size is set with setsockopt(), the value returned with getsockopt() is always double the size requested to allow for overhead. This is described in man 7 socket.
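As a quick illustration of both points (checking the real buffer size from within the application, and the doubling described in man 7 socket), something along these lines should do it; the requested size is arbitrary:

/* Sketch: set a receive buffer with setsockopt() and read it back.
 * getsockopt(SO_RCVBUF) returns roughly double the requested size
 * (the kernel doubles it to allow for overhead, per man 7 socket),
 * and requests above net.core.rmem_max are silently clamped. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int requested = 65536;              /* arbitrary example size */
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (fd < 0) {
        perror("socket");
        return 1;
    }

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) < 0)
        perror("setsockopt(SO_RCVBUF)");

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == 0)
        printf("requested %d bytes, kernel reports %d bytes\n",
               requested, actual);

    close(fd);
    return 0;
}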

You can check the value of kernel tunables with sysctl tunable.name, for example sysctl net.ipv4.tcp_rmem, or you can just run sysctl -a to see all the tunables.

I find it easier to just see all of them and grep, like sysctl -a | egrep tcp..*mem
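The same tunables are also exposed as files under /proc/sys, with the dots turned into slashes, so a program can verify them directly instead of shelling out to sysctl. A trivial sketch:

/* Sketch: net.ipv4.tcp_rmem is the file /proc/sys/net/ipv4/tcp_rmem,
 * containing the three values "min default max". */
#include <stdio.h>

int main(void)
{
    char line[256];
    FILE *f = fopen("/proc/sys/net/ipv4/tcp_rmem", "r");

    if (!f) {
        perror("fopen");
        return 1;
    }
    if (fgets(line, sizeof(line), f))
        printf("net.ipv4.tcp_rmem = %s", line);
    fclose(f);
    return 0;
}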

Your net.ipv4.tcp_mem tunable is quite interesting. Is this on an embedded system with really low RAM or something? You're collectively giving all TCP sockets a mere ~56kb of memory. Given your default of 8kb per socket, that's only seven sockets before you hit TCP memory pressure. This tunable is explained in ip-sysctl.txt which you referenced.

I wish this forum had mod points, this post deserves them!

Thanks Jamie.

Hi Jamie,
thank you for the answer.
This is a normal system with 32G RAM. I forced net.ipv4.tcp_mem to a very low value to see whether it would negatively affect the speed of the ongoing rsync transfer. And it didn't. I am puzzled.
One explanation could be that this affects only new memory allocations for new programs. But I am able to establish more than 4 new TCP connections.

Try this yourself on your test machine. Set sysctl net.ipv4.tcp_mem="4096 4096 14000". This should limit the OS to a maximum of 3 TCP connections. Then try to open another connection.
Below is my experiment:
[root@host-10-10-10-22 ~]# lsof -PniTCP
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sshd 1147 root 3u IPv4 11396 0t0 TCP *:22 (LISTEN)
sshd 1147 root 4u IPv6 11401 0t0 TCP *:22 (LISTEN)
sshd 1220 root 3r IPv4 12125 0t0 TCP 10.10.10.22:22->10.11.254.252:45192 (ESTABLISHED)
[root@host-10-10-10-22 ~]# sysctl net.ipv4.tcp_mem="4096 4096 12000"
net.ipv4.tcp_mem = 4096 4096 12000
[root@host-10-10-10-22 ~]# nc -l 8080
Hello,
as you can see it is possible to establish more than 3 connections. net.ipv4.tcp_mem should've limited us to 3 connections, but it didn't
[root@host-10-10-10-22 ~]#

The "Hello, ..." sentence above was typed in from a different machine.

Ah, please accept my apologies, I miscalculated.

14540 pages is 58160 kilobytes, which is ~56MB. So you've got significantly more than 3 sockets before you hit memory pressure! It would be more like 3635 sockets (58160 KB divided by your 16 KB per-socket maximum), with the total somewhere between 3635 and 14540 sockets depending on how much buffer each socket has actually allocated.

If the transfer is sufficiently slow, or the link sufficiently low-latency, that the BDP never exceeds 16kb, then I wouldn't expect you'd see a big performance hit. An rsync is likely to be limited by the speed of the disk, so assuming 200Mbps (disk speed) at 0.5ms (LAN latency) that's a BDP of only 200,000,000 bit/s × 0.0005 s = 100,000 bits, or 12.5 kilobytes. (And I double-checked my figures this time ;)

I got two virtual machines on the same hypervisor and shrank the buffer sizes down with sysctl net.ipv4.tcp_rmem="4096 4096 16384", and could only get about 650Mbps between them. Growing the buffer sizes to a decent sysctl net.ipv4.tcp_rmem="4096 262144 4194304", the transfer speed rose to ~12Gbps. So that does show that a small socket buffer size will restrict transfer speed. I used iperf for these tests, so the transfer was not bottlenecked by a slow backing disk.

I kinda consider the net.ipv4.tcp_mem tunable irrelevant on enterprise hardware. Nobody spends tens of thousands of dollars on servers with gigs and gigs of RAM so they can run a business application which requires network communication, then split hairs over a few megabytes (or even hundreds of megabytes) of memory usage.

Sure it's got its place in Linux which is used on a massive variety of hardware both large and small, but for large memory systems like most Red Hat customers own I usually just recommend to set net.ipv4.tcp_mem big and forget about it.

Again, sorry for leading you up the garden path with my incorrect calculations.

Hi Jamie,

can you help me explain the output of ss -itm?

# ss -itm | grep -A1 "10.96.18.40:ftp" | tail -2
ESTAB 0 0 ::ffff:10.96.18.40:ftp ::ffff:10.96.168.229:62079
skmem:(r0,rb369280,t0,tb87040,f0,w0,o0,bl0) cubic wscale:0,7 rto:224 rtt:23.196/29.89 ato:40 mss:1460 cwnd:10 ssthresh:16 send 5.0Mbps rcv_rtt:167 rcv_space:29348

What does skmem:(r0,rb369280,t0,tb87040,f0,w0,o0,bl0) mean?

I've covered this previously in our Customer Portal Discussions at: https://access.redhat.com/discussions/3624151