TCP tuning for a high bandwidth-delay product network


We are trying to copy some files over a very high latency connection with a round-trip time of about 325 ms. Using UDP, I'm able to get throughput of about 80 Mb/s, but with various TCP-based methods such as SCP or HTTP, I'm only getting about 25 Mb/s. In addition to our "production" environment, we have also used a network emulator in a lab to replicate the bandwidth and latency, and we get the same results there. After doing some web searching, I tried the following:

  • Sending system:
    net.core.wmem_default set to 8000000 (2x BDP)
    net.core.wmem_max set to 16000000
    net.ipv4.tcp_wmem set to 4096 8000000 16000000
    Rebooted, verified parameters
    Ethernet interface ring buffer sizes increased from the 512 default to the 4096 maximum
    Increased txqueuelen to 10000

  • Receiving system:
    net.core.rmem_default set to 8000000
    net.core.rmem_max set to 16000000
    net.ipv4.tcp_rmem set to 4096 8000000 16000000
    Rebooted, verified parameters
    Ethernet interface ring buffer increased from the 512 default to 4096

Not only did this not fix the problem, we didn't see any appreciable difference in throughput at all. One thing we have tried is multi-threading, which gave us a moderate improvement. Has anyone had success with tuning for a high-latency connection, and if so, what are we missing? Thanks for your help!
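
For anyone reproducing this, here is a minimal sketch (assuming Python 3 on the Linux hosts; the parameter list is just the values changed above) of one way to confirm the sysctl settings actually took effect after the reboot:

    #!/usr/bin/env python3
    # Read the current values straight out of /proc/sys and print them.
    params = [
        "net.core.wmem_default", "net.core.wmem_max", "net.ipv4.tcp_wmem",
        "net.core.rmem_default", "net.core.rmem_max", "net.ipv4.tcp_rmem",
    ]
    for p in params:
        path = "/proc/sys/" + p.replace(".", "/")
        with open(path) as f:
            print(p, "=", f.read().strip())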

Responses

What's the bandwidth of the link? If the 8,000,000-byte buffer really is 2x BDP, then the BDP is about 4 MB, which at a 325 ms RTT implies roughly 100 Mbps.
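
To make the arithmetic explicit, a quick sketch (plain Python; the 100 Mbps figure is the assumption being checked, not a measurement):

    # bandwidth-delay product = bandwidth (bits/s) x RTT (s), expressed in bytes
    rtt = 0.325             # 325 ms round-trip time
    bandwidth = 100e6       # assumed 100 Mbps link
    bdp = bandwidth * rtt / 8
    print(f"BDP = {bdp / 1e6:.1f} MB")  # ~4.1 MB, i.e. half of the 8,000,000 described as 2x BDP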

I have found that larger than 2x BDP is usually required for best throughput; try bumping the default up to 16 MiB and the max up to 32 MiB.
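
As a rough sketch only (values in bytes, a starting point rather than a definitive recommendation), that suggestion would look something like this in /etc/sysctl.d/ on both hosts:

    net.core.rmem_max = 33554432
    net.core.wmem_max = 33554432
    net.ipv4.tcp_rmem = 4096 16777216 33554432
    net.ipv4.tcp_wmem = 4096 16777216 33554432

The middle value of tcp_rmem/tcp_wmem is the autotuning default (16 MiB here) and the last value is the per-socket maximum (32 MiB).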

Don't touch tcp_mem. That's measured in pages and applies to the entire TCP stack, not to an individual socket. Set it back to whatever the kernel calculated as the default based on RAM size at boot.
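
To put the units in perspective, a byte-sized value copied into tcp_mem would be enormous. A quick illustration (Python; the 8,000,000 figure is hypothetical, borrowed from the byte-sized settings above):

    import resource
    page_size = resource.getpagesize()           # typically 4096 bytes
    pages = 8_000_000                            # a byte-style value mistakenly used as a page count
    print(f"{pages * page_size / 1e9:.0f} GB")   # ~33 GB of TCP memory, far more than intended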

You've already mostly followed the advice in "How do I tune RHEL for better TCP performance over a specific network connection?", but the net.ipv4.tcp_slow_start_after_idle = 0 setting mentioned there is worth a try too.

Don't test with SCP; it will bottleneck on CPU-bound encryption: Slow SSH or SCP transfer performance

I would start with a basic memory-to-memory network test such as iperf3 before moving on to more complex, disk-backed transfers such as HTTP, FTP, or NFS.
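
For example (the exact flags here are only a suggestion), an iperf3 run between the two hosts takes disk and application protocol out of the picture:

    # on the receiving system
    iperf3 -s
    # on the sending system: run a 30-second test, optionally with parallel streams
    iperf3 -c <receiver> -t 30
    iperf3 -c <receiver> -t 30 -P 4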

UDP doesn't have window moderation (no flow or congestion control), so you can either transmit a little reliably or a lot unreliably. You'll never get a UDP transfer that is both lossless and fast. Use TCP for this testing.