Chapter 34. Getting started with Multipath TCP

Transmission Control Protocol (TCP) ensures reliable delivery of data through the internet and automatically adjusts its bandwidth in response to network load. Multipath TCP (MPTCP) is an extension to the original, single-path TCP protocol. MPTCP enables a transport connection to operate across multiple paths simultaneously and brings network connection redundancy to user endpoint devices.

34.1. Understanding MPTCP

The Multipath TCP (MPTCP) protocol allows for simultaneous usage of multiple paths between connection endpoints. The protocol design improves connection stability and also brings other benefits compared to the single-path TCP.

Note

In MPTCP terminology, links are considered as paths.

The following are some of the advantages of using MPTCP:

  • It allows a connection to simultaneously use multiple network interfaces.
  • If the connection throughput is limited by the link speed, using multiple links can increase the throughput. Note that if the connection is limited by the CPU, using multiple links slows the connection down.
  • It increases the resilience to link failures.

For more details about MPTCP, see the Additional resources.

34.2. Preparing RHEL to enable MPTCP support

By default, MPTCP support is disabled in RHEL. Enable MPTCP so that applications that support this feature can use it. Additionally, you have to configure user-space applications to force the use of MPTCP sockets if those applications use TCP sockets by default.

Use the sysctl utility to enable MPTCP support in the kernel, and use a SystemTap script to make applications create MPTCP sockets system-wide.

Prerequisites

The following packages are installed:

  • systemtap
  • iperf3

Procedure

  1. Enable MPTCP sockets in the kernel:

    # echo "net.mptcp.enabled=1" > /etc/sysctl.d/90-enable-MPTCP.conf
    # sysctl -p /etc/sysctl.d/90-enable-MPTCP.conf
  2. Verify that MPTCP is enabled in the kernel:

    # sysctl -a | grep mptcp.enabled
    net.mptcp.enabled = 1
  3. Create a mptcp-app.stap file with the following content:

    #!/usr/bin/env stap
    
    %{
    #include <linux/in.h>
    #include <linux/ip.h>
    %}
    
    /* RSI contains 'type' and RDX contains 'protocol'.
     */
    
    function mptcpify () %{
        if (CONTEXT->kregs->si == SOCK_STREAM &&
            (CONTEXT->kregs->dx == IPPROTO_TCP ||
             CONTEXT->kregs->dx == 0)) {
            CONTEXT->kregs->dx = IPPROTO_MPTCP;
            STAP_RETVALUE = 1;
        } else {
            STAP_RETVALUE = 0;
        }
    %}
    
    probe kernel.function("__sys_socket") {
        if (mptcpify() == 1) {
            printf("command %16s mptcpified\n", execname());
        }
    }
  4. Force user space applications to create MPTCP sockets instead of TCP ones:

    # stap -vg mptcp-app.stap

    Note: This operation affects all TCP sockets that are created after you start the command. After you interrupt the command with Ctrl+C, applications create plain TCP sockets again.

  5. Alternatively, to allow MPTCP usage for only a specific application, replace the content of the mptcp-app.stap file with the following:

    #!/usr/bin/env stap
    
    %{
    #include <linux/in.h>
    #include <linux/ip.h>
    %}
    
    /* according to [1], RSI contains 'type' and RDX
     * contains 'protocol'.
     * [1] https://github.com/torvalds/linux/blob/master/arch/x86/entry/entry_64.S#L79
     */
    
    function mptcpify () %{
    	if (CONTEXT->kregs->si == SOCK_STREAM &&
    	    (CONTEXT->kregs->dx == IPPROTO_TCP ||
    	     CONTEXT->kregs->dx == 0)) {
    		CONTEXT->kregs->dx = IPPROTO_MPTCP;
    		STAP_RETVALUE = 1;
    	} else {
    		STAP_RETVALUE = 0;
    	}
    %}
    
    probe kernel.function("__sys_socket") {
    	cur_proc = execname()
    	if ((cur_proc == @1) && (mptcpify() == 1)) {
    		printf("command %16s mptcpified\n", cur_proc);
    	}
    }
  6. If you use the alternative script, pass the name of the application as an argument. For example, to force the iperf3 tool to use MPTCP instead of TCP, enter the following command:

    # stap -vg mptcp-app.stap iperf3
  7. After the mptcp-app.stap script installs the kernel probe, the following warnings appear in the kernel dmesg output:

    # dmesg
    ...
    [ 1752.694072] Kprobes globally unoptimized
    [ 1752.730147] stap_1ade3b3356f3e68765322e26dec00c3d_1476: module_layout: kernel tainted.
    [ 1752.732162] Disabling lock debugging due to kernel taint
    [ 1752.733468] stap_1ade3b3356f3e68765322e26dec00c3d_1476: loading out-of-tree module taints kernel.
    [ 1752.737219] stap_1ade3b3356f3e68765322e26dec00c3d_1476: module verification failed: signature and/or required key missing - tainting kernel
    [ 1752.737219] stap_1ade3b3356f3e68765322e26dec00c3d_1476 (mptcp-app.stap): systemtap: 4.5/0.185, base: ffffffffc0550000, memory: 224data/32text/57ctx/65638net/367alloc kb, probes: 1
  8. Start the iperf3 server:

    # iperf3 -s
    
    Server listening on 5201
  9. Connect the client to the server:

    # iperf3 -c 127.0.0.1 -t 3
  10. After the connection is established, check the ss output to see the subflow-specific status:

    # ss -nti '( dport :5201 )'
    
    State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
    ESTAB 0      0      127.0.0.1:41842    127.0.0.1:5201
    cubic wscale:7,7 rto:205 rtt:4.455/8.878 ato:40 mss:21888 pmtu:65535 rcvmss:536 advmss:65483 cwnd:10 bytes_sent:141 bytes_acked:142 bytes_received:4 segs_out:8 segs_in:7 data_segs_out:3 data_segs_in:3 send 393050505bps lastsnd:2813 lastrcv:2772 lastack:2772 pacing_rate 785946640bps delivery_rate 10944000000bps delivered:4 busy:41ms rcv_space:43690 rcv_ssthresh:43690 minrtt:0.008 tcp-ulp-mptcp flags:Mmec token:0000(id:0)/2ff053ec(id:0) seq:3e2cbea12d7673d4 sfseq:3 ssnoff:ad3d00f4 maplen:2
  11. Verify MPTCP counters:

    # nstat MPTcp*
    
    #kernel
    MPTcpExtMPCapableSYNRX          2                  0.0
    MPTcpExtMPCapableSYNTX          2                  0.0
    MPTcpExtMPCapableSYNACKRX       2                  0.0
    MPTcpExtMPCapableACKRX          2                  0.0
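
To script the check from step 10, you can count the connections that ss reports as MPTCP subflows. The following is a minimal sketch, assuming that the tcp-ulp-mptcp marker from the sample ss output above appears once for each MPTCP subflow. The count_mptcp_subflows function name is hypothetical and not part of any RHEL tool.

```shell
# Count MPTCP subflows in "ss -nti" output read from standard input.
# Assumption: the info line of each subflow contains the "tcp-ulp-mptcp"
# marker, as in the sample output of the procedure above.
count_mptcp_subflows() {
    # "n + 0" forces a numeric 0 when no line matched.
    awk '/tcp-ulp-mptcp/ { n++ } END { print n + 0 }'
}
```

For example, pipe the output of the ss -nti '( dport :5201 )' command into count_mptcp_subflows. A result of 0 means that no matching connection carries the marker, for example because the application still uses plain TCP sockets.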

34.3. Using iproute2 to temporarily configure and enable multiple paths for MPTCP applications

By default, each MPTCP connection uses a single subflow, similar to plain TCP. To get the MPTCP benefits, specify a higher limit for the maximum number of subflows for each MPTCP connection. Then configure additional endpoints to create those subflows.

Important

The configuration in this procedure will not persist after rebooting your machine.

Note that MPTCP does not yet support mixed IPv6 and IPv4 endpoints for the same socket. Use endpoints belonging to the same address family.
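
Because all endpoints of one socket must share an address family, it can help to compare addresses before you add them as endpoints. The following sketch classifies an address by its textual form only. The addr_family and same_family helper names are hypothetical, and the check does not validate that an address is well formed.

```shell
# Print "inet6" for addresses that contain a colon, "inet" for dotted
# addresses, and "unknown" otherwise. This is a textual heuristic only.
addr_family() {
    case "$1" in
        *:*) echo inet6 ;;
        *.*) echo inet ;;
        *)   echo unknown ;;
    esac
}

# Succeed only if both addresses belong to the same, known family.
same_family() {
    fam_a=$(addr_family "$1")
    fam_b=$(addr_family "$2")
    [ "$fam_a" != "unknown" ] && [ "$fam_a" = "$fam_b" ]
}
```

For example, same_family 192.0.2.1 198.51.100.1 succeeds, whereas same_family 192.0.2.1 2001:db8::1 fails because the second address is an IPv6 address.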

Prerequisites

  • The iperf3 package is installed
  • Server network interface settings:

    • enp4s0: 192.0.2.1/24
    • enp1s0: 198.51.100.1/24
  • Client network interface settings:

    • enp4s0f0: 192.0.2.2/24
    • enp4s0f1: 198.51.100.2/24

Procedure

  1. Configure the client to accept up to 1 additional remote address, as provided by the server:

    # ip mptcp limits set add_addr_accepted 1
  2. Add IP address 198.51.100.1 as a new MPTCP endpoint on the server:

    # ip mptcp endpoint add 198.51.100.1 dev enp1s0 signal

    The signal option ensures that the ADD_ADDR packet is sent after the three-way handshake.

  3. Start the iperf3 server:

    # iperf3 -s
    
    Server listening on 5201
  4. Connect the client to the server:

    # iperf3 -c 192.0.2.1 -t 3

Verification

  1. Verify the connection is established:

    # ss -nti '( sport :5201 )'
  2. Verify the connection and IP address limit:

    # ip mptcp limits show
  3. Verify the newly added endpoint:

    # ip mptcp endpoint show
  4. Verify MPTCP counters by using the nstat MPTcp* command on the server:

    # nstat MPTcp*
    
    #kernel
    MPTcpExtMPCapableSYNRX          2                  0.0
    MPTcpExtMPCapableACKRX          2                  0.0
    MPTcpExtMPJoinSynRx             2                  0.0
    MPTcpExtMPJoinAckRx             2                  0.0
    MPTcpExtEchoAdd                 2                  0.0
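
The MPJoin counters in this output indicate whether additional subflows joined an existing connection. As a rough sketch, the following hypothetical nstat_value helper extracts a single counter from nstat output on standard input, assuming the three-column layout shown above.

```shell
# Print the value of one counter from nstat output on standard input.
# Returns a non-zero exit status if the counter is not listed.
# Assumption: the counter name is in column 1 and its value in column 2.
nstat_value() {
    awk -v name="$1" '
        $1 == name { print $2; found = 1 }
        END { exit (found ? 0 : 1) }'
}
```

For example, pipe the output of nstat 'MPTcp*' into nstat_value MPTcpExtMPJoinAckRx. By default, nstat does not list counters without activity, so a non-zero exit status typically means that no additional subflow was accepted since the previous nstat run.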

Additional resources

  • ip-mptcp(8) man page
  • mptcpize(8) man page

34.4. Permanently configuring multiple paths for MPTCP applications

You can configure Multipath TCP (MPTCP) by using the nmcli command to permanently establish multiple subflows between a source and destination system. The subflows can use different resources, different routes to the destination, and even different networks, such as Ethernet, cellular, and WiFi. As a result, you achieve combined connections, which increase network resilience and throughput.

The server uses the following network interfaces in our example:

  • enp4s0: 192.0.2.1/24
  • enp1s0: 198.51.100.1/24
  • enp7s0: 192.0.2.3/24

The client uses the following network interfaces in our example:

  • enp4s0f0: 192.0.2.2/24
  • enp4s0f1: 198.51.100.2/24
  • enp6s0: 192.0.2.5/24

Prerequisites

  • You configured the default gateway on the relevant interfaces.

Procedure

  1. Enable MPTCP sockets in the kernel:

    # echo "net.mptcp.enabled=1" > /etc/sysctl.d/90-enable-MPTCP.conf
    # sysctl -p /etc/sysctl.d/90-enable-MPTCP.conf
  2. Optional: The RHEL kernel default for the subflow limit is 2. If you require more:

    1. Create the /etc/systemd/system/set_mptcp_limit.service file with the following content:

      [Unit]
      Description=Set MPTCP subflow limit to 3
      After=network.target
      
      [Service]
      ExecStart=ip mptcp limits set subflows 3
      Type=oneshot
      
      [Install]
      WantedBy=multi-user.target

      The oneshot unit executes the ip mptcp limits set subflows 3 command during every boot process, after your network (network.target) is operational.

      The ip mptcp limits set subflows 3 command sets the maximum number of additional subflows for each connection to 3. Together with the initial subflow, each connection can therefore use up to 4 subflows in total.

    2. Enable the set_mptcp_limit service:

      # systemctl enable --now set_mptcp_limit
  3. Enable MPTCP on all connection profiles that you want to use for connection aggregation:

    # nmcli connection modify <profile_name> connection.mptcp-flags signal,subflow,also-without-default-route

    The connection.mptcp-flags parameter configures MPTCP endpoints and the IP address flags. If MPTCP is enabled in a NetworkManager connection profile, the setting will configure the IP addresses of the relevant network interface as MPTCP endpoints.

    By default, NetworkManager does not add MPTCP flags to IP addresses if there is no default gateway. To bypass that check, also set the also-without-default-route flag.
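
If several profiles take part in the aggregation, you can apply step 3 in a loop. The following is a sketch, not a supported tool: the enable_mptcp_flags function and the DRY_RUN variable are hypothetical names. With DRY_RUN=1, the function only prints the commands so that you can review them first.

```shell
# Apply the MPTCP flags from step 3 to every profile named as an argument.
# Hypothetical helper: set DRY_RUN=1 to print the nmcli commands instead
# of running them.
enable_mptcp_flags() {
    flags="signal,subflow,also-without-default-route"
    for profile in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'nmcli connection modify "%s" connection.mptcp-flags %s\n' \
                "$profile" "$flags"
        else
            # Stop at the first profile that nmcli rejects.
            nmcli connection modify "$profile" connection.mptcp-flags "$flags" || return 1
        fi
    done
}
```

For example, after you set DRY_RUN=1, the enable_mptcp_flags enp4s0 enp1s0 call prints one nmcli command for each profile, assuming that the profiles are named after the interfaces.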

Verification

  1. Verify that you enabled the MPTCP kernel parameter:

    # sysctl net.mptcp.enabled
    net.mptcp.enabled = 1
  2. If you changed the default subflow limit, verify that you set it correctly:

    # ip mptcp limits show
    add_addr_accepted 2 subflows 3
  3. Verify that you configured the per-address MPTCP setting correctly:

    # ip mptcp endpoint show
    192.0.2.1 id 1 subflow dev enp4s0
    198.51.100.1 id 2 subflow dev enp1s0
    192.0.2.3 id 3 subflow dev enp7s0
    192.0.2.4 id 4 subflow dev enp3s0
    ...

34.5. Monitoring MPTCP sub-flows

The life cycle of a multipath TCP (MPTCP) socket can be complex: The main MPTCP socket is created, the MPTCP path is validated, one or more sub-flows are created and eventually removed. Finally, the MPTCP socket is terminated.

The MPTCP protocol allows monitoring MPTCP-specific events related to socket and sub-flow creation and deletion, using the ip utility provided by the iproute package. This utility uses the netlink interface to monitor MPTCP events.

This procedure demonstrates how to monitor MPTCP events. For that, it simulates an MPTCP server application, and a client connects to this service. The hosts in this example use the following interfaces and IP addresses:

  • Server: 192.0.2.1
  • Client (Ethernet connection): 192.0.2.2
  • Client (WiFi connection): 192.0.2.3

To simplify this example, all interfaces are within the same subnet. This is not a requirement. However, it is important that routing has been configured correctly, and the client can reach the server via both interfaces.

Prerequisites

  • A RHEL client with two network interfaces, such as a laptop with Ethernet and WiFi
  • The client can connect to the server via both interfaces
  • A RHEL server
  • Both the client and the server run RHEL 8.6 or later

Procedure

  1. Set the per-connection additional subflow limit to 1 on both the client and the server:

    # ip mptcp limits set add_addr_accepted 0 subflows 1
  2. On the server, to simulate an MPTCP server application, start netcat (nc) in listen mode with enforced MPTCP sockets instead of TCP sockets:

    # nc -l -k -p 12345

    The -k option prevents nc from closing the listener after the first accepted connection. This is required to demonstrate the monitoring of sub-flows.

  3. On the client:

    1. Identify the interface with the lowest metric:

      # ip -4 route
      192.0.2.0/24 dev enp1s0 proto kernel scope link src 192.0.2.2 metric 100
      192.0.2.0/24 dev wlp1s0 proto kernel scope link src 192.0.2.3 metric 600

      The enp1s0 interface has a lower metric than wlp1s0. Therefore, RHEL uses enp1s0 by default.

    2. On the first terminal, start the monitoring:

      # ip mptcp monitor
    3. On the second terminal, start an MPTCP connection to the server:

      # nc 192.0.2.1 12345

      RHEL uses the enp1s0 interface and its associated IP address as a source for this connection.

      On the monitoring terminal, the ip mptcp monitor command now logs:

      [       CREATED] token=63c070d2 remid=0 locid=0 saddr4=192.0.2.2 daddr4=192.0.2.1 sport=36444 dport=12345

      The token uniquely identifies the MPTCP socket and later enables you to correlate MPTCP events on the same socket.

    4. On the terminal with the running nc connection to the server, press Enter. This first data packet fully establishes the connection. Note that, as long as no data has been sent, the connection is not established.

      On the monitoring terminal, ip mptcp monitor now logs:

      [   ESTABLISHED] token=63c070d2 remid=0 locid=0 saddr4=192.0.2.2 daddr4=192.0.2.1 sport=36444 dport=12345
    5. Optional: Display the connections to port 12345 on the server:

      # ss -taunp | grep ":12345"
      tcp ESTAB  0  0         192.0.2.2:36444 192.0.2.1:12345

      At this point, only one connection to the server has been established.

    6. On a third terminal, create another endpoint:

      # ip mptcp endpoint add dev wlp1s0 192.0.2.3 subflow

      In this command, set the name and IP address of the WiFi interface of the client.

      On the monitoring terminal, ip mptcp monitor now logs:

      [SF_ESTABLISHED] token=63c070d2 remid=0 locid=2 saddr4=192.0.2.3 daddr4=192.0.2.1 sport=53345 dport=12345 backup=0 ifindex=3

      The locid field displays the local address ID of the new sub-flow and identifies this sub-flow even if the connection uses network address translation (NAT). The saddr4 field matches the endpoint’s IP address from the ip mptcp endpoint add command.

    7. Optional: Display the connections to port 12345 on the server:

      # ss -taunp | grep ":12345"
      tcp ESTAB  0  0         192.0.2.2:36444 192.0.2.1:12345
      tcp ESTAB  0  0  192.0.2.3%wlp1s0:53345 192.0.2.1:12345

      The command now displays two connections:

      • The connection with source address 192.0.2.2 corresponds to the first MPTCP sub-flow that you established previously.
      • The connection from the sub-flow over the wlp1s0 interface with source address 192.0.2.3.
    8. On the third terminal, delete the endpoint:

      # ip mptcp endpoint delete id 2

      Use the ID from the locid field from the ip mptcp monitor output, or retrieve the endpoint ID using the ip mptcp endpoint show command.

      On the monitoring terminal, ip mptcp monitor now logs:

      [     SF_CLOSED] token=63c070d2 remid=0 locid=2 saddr4=192.0.2.3 daddr4=192.0.2.1 sport=53345 dport=12345 backup=0 ifindex=3
    9. On the second terminal with the nc client, press Ctrl+C to terminate the session.

      On the monitoring terminal, ip mptcp monitor now logs:

      [        CLOSED] token=63c070d2
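
Because every event line carries the token, you can filter the monitor output to follow a single MPTCP socket. The sketch below assumes the log format shown in this procedure, with the event name in square brackets followed by key=value fields. The events_for_token function name is hypothetical.

```shell
# Print the event names logged for one MPTCP token, in order of appearance.
# Reads "ip mptcp monitor" output from standard input.
events_for_token() {
    awk -v want="token=$1" '
        {
            # The event name sits between "[" and "]", padded with spaces.
            event = $0
            sub(/^[ \t]*\[[ \t]*/, "", event)
            sub(/\].*$/, "", event)
            # Print the event only if the line carries the wanted token.
            for (i = 1; i <= NF; i++)
                if ($i == want) { print event; break }
        }'
}
```

For example, if you saved the monitor output of this procedure to a file, running events_for_token 63c070d2 with that file as standard input lists the CREATED, ESTABLISHED, SF_ESTABLISHED, SF_CLOSED, and CLOSED events of the socket.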

34.6. Disabling Multipath TCP in the kernel

You can explicitly disable the MPTCP option in the kernel.

Procedure

  • Disable the net.mptcp.enabled option:

    # echo "net.mptcp.enabled=0" > /etc/sysctl.d/90-enable-MPTCP.conf
    # sysctl -p /etc/sysctl.d/90-enable-MPTCP.conf

Verification

  • Verify that net.mptcp.enabled is disabled in the kernel:

    # sysctl -a | grep mptcp.enabled
    net.mptcp.enabled = 0