kernel: Possible SYN flooding on port #. Sending cookies.
Environment
- Red Hat Enterprise Linux (RHEL)
- TCP network connections
Issue
- One of the following messages is logged:
kernel: possible SYN flooding on port X.
kernel: possible SYN flooding on port X. Sending cookies.
kernel: Possible SYN flooding on port X. Check SNMP counters.
kernel: Possible SYN flooding on port X. Sending cookies. Check SNMP counters.
kernel: TCPv6: Possible SYN flooding on port X.
- Our system is sending SYN cookies.
- Client application has high load with many rapid TCP connections, which appears to SYN flood the server.
- What tunables in the kernel can help guard against or make a system resistant to SYN-FLOOD attacks?
- In netstat -s I see "X times the listen queue of a socket overflowed" or "SYNs to LISTEN sockets dropped" growing
- The ListenOverflows or ListenDrops value of /proc/net/netstat is increasing
- Kernel dropping TCP connections due to LISTEN sockets buffer full in Red Hat Enterprise Linux
- During peak periods, RHEL server would drop TCP SYN packets due to the kernel's buffer of LISTEN sockets being full and overflowing
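As a rough illustration of how the ListenOverflows and ListenDrops counters can be read, the following Python sketch parses the TcpExt lines of /proc/net/netstat, which come as a line of names followed by a line of values. The function names and the abbreviated SAMPLE text are assumptions made for this example only; a real file contains many more counters.

```python
def parse_tcpext(netstat_text):
    """Return a dict of TcpExt counters from /proc/net/netstat-style text."""
    rows = [ln.split() for ln in netstat_text.splitlines() if ln.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]   # first row: names, second row: values
    return dict(zip(names, (int(v) for v in values)))


def listen_queue_counters(netstat_text):
    """Pick out the two counters discussed in this article."""
    counters = parse_tcpext(netstat_text)
    return {k: counters.get(k, 0) for k in ("ListenOverflows", "ListenDrops")}


# Invented, shortened sample used when /proc/net/netstat is not available.
SAMPLE = (
    "TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed ListenOverflows ListenDrops\n"
    "TcpExt: 12 3 1 40 52\n"
)

if __name__ == "__main__":
    try:
        with open("/proc/net/netstat") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    print(listen_queue_counters(text))
```

Running the sketch twice a few seconds apart and comparing the two results shows whether the counters are still increasing.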
Resolution
If required, refer to the below Root Cause section to obtain an understanding of TCP SYN, TCP handshake, listening sockets, SYN flood, and SYN cookies. An understanding of these terms is recommended before investigating or implementing any changes.
Table of Contents
- Determine whether the traffic is valid or malicious
- If the traffic is malicious
- If the traffic is valid
Determine whether the traffic is valid or malicious
Use application debugging, network monitoring tools, or work with your network team or service provider.
This requires an understanding of:
- the application's expected workload
- expected client IP addresses
- and expected client behaviour
Use the netstat or ss commands to inspect TCP socket states as follows, where X is the port number reported in the Possible SYN flooding on port X message:
netstat -nta | egrep "State|X"
ss -nta '( sport = :X )'
Having many sockets in the SYN-RECV state could mean a malicious "SYN flood" attack, though this is not the only type of malicious attack. You may also wish to inspect the source IP addresses of traffic to the port in question to confirm if client IPs are expected or unexpected.
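The same per-state view can be approximated without netstat or ss by reading /proc/net/tcp, where the port is hexadecimal and state 03 is SYN_RECV. The sketch below is a minimal example under those assumptions; it handles IPv4 only, the function name is made up for illustration, and the SAMPLE lines are invented and truncated to the first few columns.

```python
from collections import Counter

# Hex state codes used in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV", "04": "FIN_WAIT1",
    "05": "FIN_WAIT2", "06": "TIME_WAIT", "07": "CLOSE", "08": "CLOSE_WAIT",
    "09": "LAST_ACK", "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text, local_port):
    """Count sockets per TCP state for one local port."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:      # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        port = int(fields[1].rsplit(":", 1)[1], 16)      # local_address is HEXIP:HEXPORT
        if port == local_port:
            state = fields[3].upper()
            counts[TCP_STATES.get(state, state)] += 1
    return counts


# Invented sample: a listener on port 9001 (0x2329) with two half-open connections.
SAMPLE = "\n".join([
    "  sl  local_address rem_address   st tx_queue rx_queue",
    "   0: 00000000:2329 00000000:0000 0A 00000000:00000000",
    "   1: 0100007F:2329 0100007F:C350 03 00000000:00000000",
    "   2: 0100007F:2329 0100007F:C351 03 00000000:00000000",
    "   3: 0100007F:0016 0100007F:C352 01 00000000:00000000",
])

if __name__ == "__main__":
    print(dict(count_states(SAMPLE, 9001)))
```

A steadily large SYN_RECV count for the reported port is the pattern described above; the remote addresses in the third column can be inspected in the same way.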
The SystemTap script at Where are TCP SYNs coming from? can be used to monitor valid incoming SYNs to sockets in LISTEN state, even SYNs which are later rejected as SYN Flood or with SYN Cookies.
Use the tcpdump command to capture network traffic. Use the Packet Capture Syntax Generator to generate meaningful command options. Also refer to How to capture network packets with tcpdump? or the manual page man tcpdump.
A packet capture which displays many SYNs, which the server responds to with SYN+ACK, but the client never replies with the final ACK could mean a traditional "SYN flood" attack, though this is not the only type of malicious attack.
It is up to you to determine whether incoming traffic is valid or not. Red Hat have no knowledge of your application traffic or environment or expected client addresses. Red Hat can help to use these commands to extract meaningful results, but the decision whether traffic is valid or malicious is up to you and your business to make.
If the traffic is malicious
Work with your network team or service provider to block the traffic before it reaches the listening system or your network.
You may also use the iptables firewall to block traffic using the limit or hashlimit or connlimit match extensions. For full syntax and examples see:
- RHEL 7:
man iptables-extensions
- RHEL 5 and 6:
man iptables
Note that Red Hat are able to assist with usage of the iptables commands but are not able to write firewall rules to resolve malicious attacks against customers. Such an action would be development and implementation of security policy which is outside the Production Support Scope of Coverage.
If the traffic is valid
Confirm the application is accepting new connections
Confirm the application is actually making the accept() system call to move new connections out of the socket backlog.
Use the strace command as described at How do I use strace to trace system calls made by a command? or use application-specific debugging.
If the application is not calling accept() at all, or is calling it more slowly than expected, then debug the application to determine why it is not accepting new connections fast enough.
If the application is accepting new connections
If you confirm that the application is accepting new connections and the rate of valid traffic is too high for the application, then two changes must be made to allow this listening application to cope with the workload.
These changes are:
- the kernel's socket backlog limit
- the application's socket listen backlog
Both of these must be changed. There is no point changing one but not the other.
Increase kernel socket backlog limit
The kernel's socket backlog limit is controlled by the net.core.somaxconn kernel tunable.
View the current value of the tunable with the command:
# sysctl net.core.somaxconn
net.core.somaxconn = 128
Increase the value with a command such as:
# sysctl -w net.core.somaxconn=2048
net.core.somaxconn = 2048
Confirm the change by viewing again:
# sysctl net.core.somaxconn
net.core.somaxconn = 2048
Persist this change across reboots by entering the corresponding line into /etc/sysctl.conf:
# echo "net.core.somaxconn = 2048" >> /etc/sysctl.conf
Note that 2048 is only an example; the appropriate value for your workload may be smaller or larger.
After changing this tunable, restart the application for the changes to take effect at the next listen() call.
Note: On some systems, it may also be necessary to change the limit of currently-handshaking (SYN-RECV and waiting for ACK) connections on a socket with the kernel tunable:
# sysctl net.ipv4.tcp_max_syn_backlog
net.ipv4.tcp_max_syn_backlog = 512
Increase with sysctl -w and persist across reboots with /etc/sysctl.conf as per the examples above.
After changing this tunable, restart the application for the changes to take effect at the next listen() call.
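For scripted checks, the two tunables can also be read from their files under /proc/sys, where each dot in the sysctl name becomes a directory separator. The following Python sketch assumes that mapping; the helper names and the proc_root parameter are hypothetical and exist only so the example can be pointed at a different directory.

```python
import os


def sysctl_path(name, proc_root="/proc/sys"):
    """Map a sysctl name such as net.core.somaxconn to its file path."""
    return os.path.join(proc_root, *name.split("."))


def read_sysctl_int(name, proc_root="/proc/sys"):
    """Read the first integer value of a sysctl tunable."""
    with open(sysctl_path(name, proc_root)) as f:
        return int(f.read().split()[0])


if __name__ == "__main__":
    for tunable in ("net.core.somaxconn", "net.ipv4.tcp_max_syn_backlog"):
        try:
            print(tunable, "=", read_sysctl_int(tunable))
        except OSError as exc:
            # Not running on Linux, or the tunable is not present.
            print(tunable, "unavailable:", exc)
```

This only reads the values; changing them is still done with sysctl -w and /etc/sysctl.conf as shown above.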
Increase application socket listen backlog
The application's socket listen backlog is applied when the application makes the listen() system call against its socket.
This example in the C language shows the change from a small listen backlog to a larger backlog:
-   rc = listen(sockfd, 128);     /* old line */
+   rc = listen(sockfd, 2048);    /* new line */
    if (rc < 0)
    {
        perror("listen() failed");
        close(sockfd);
        exit(-1);
    }
The manual page man listen shows the syntax in C:
int listen(int sockfd, int backlog);
Other programming languages may implement the listen backlog with a different syntax.
An application may even make the listen backlog a configurable value.
After changing the application listen backlog, recompile the application if it is written in a compiled language, then restart it for the change to apply. An application written in an interpreted language, or one where only configuration was changed, just needs a restart.
In the event an application has a hard-coded listen backlog which cannot be changed, an unsupported method to override the listen() function is described at How can I increase the TCP listen backlog value of a socket when the application has a hardcoded value?.
Some known programming language constructs and configuration options are listed below.
Java
In the Java ServerSocket object the syntax is:
ServerSocket(int port, int backlog)
Python
The Python 2 and Python 3 socket libraries both implement the .listen() method on a socket object:
socket.listen(backlog)
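As a minimal sketch of how a Python application might make the listen backlog configurable, the example below reads the value from an environment variable and passes it to listen(). The variable name LISTEN_BACKLOG and both helper functions are hypothetical, chosen for illustration only. The kernel still caps the requested value at net.core.somaxconn.

```python
import os
import socket


def backlog_from_env(environ=os.environ, name="LISTEN_BACKLOG", default=128):
    """Read the listen backlog from an environment variable (hypothetical name)."""
    return int(environ.get(name, default))


def open_listener(host="127.0.0.1", port=0, backlog=128):
    """Create a TCP listening socket with the requested backlog."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))          # port 0 lets the kernel choose a free port
    sock.listen(backlog)             # capped by the kernel at net.core.somaxconn
    return sock


if __name__ == "__main__":
    listener = open_listener(backlog=backlog_from_env())
    print("listening on", listener.getsockname())
    listener.close()
```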
Apache
In the Apache Web Server, the listen backlog can be configured in the httpd.conf configuration file:
ListenBacklog 512
nginx
In nginx, the listen backlog can be configured as part of the listen directive:
listen 80 backlog=512;
named
In named, the listen backlog can be configured using the tcp-listen-queue directive, which is 10 by default:
tcp-listen-queue 512;
Squid
In the Squid proxy, the listen backlog can be configured in squid.conf with:
max_filedescriptors 512
If Squid is compiled with USE_SELECT, the maximum value for this option is 1024. If the value is not compatible, Squid will log the error WARNING: 'max_filedescriptors X' does not work with select() when the service is started.
OpenSSH (sshd)
The OpenSSH server listen backlog is hard-coded to 128 and cannot be changed.
If you believe you have a SSH accept performance issue, please open a support case with Red Hat for investigation.
Samba (smbd)
The Samba listen backlog is hard-coded to 50 and cannot be changed.
If you believe you have a Samba accept performance issue, please open a support case with Red Hat for investigation.
Confirm the change in application behaviour
This can be done in multiple ways.
1) Run the application as normal. Once the application has started, view the backlog value under the Send-Q column in ss -ntlp output.
The following example shows the listen backlog in use is 10:
# ss -ntlp | more
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 10 *:9001 *:* users:(("nc",pid=1234,fd=3))
^^
value is 10
2) Run the application under the strace system call tracer and observe the values passed to the listen() system call.
The following example shows the listen backlog in use is 10:
# strace -fvttTyyx -s 4096 -e socket,bind,listen nc -n4l 9001
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3<TCP:[502295]>
bind(3<TCP:[502295]>, {sa_family=AF_INET, sin_port=htons(9001), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(3<TCP:[502295]>, 10) = 0
^^
value is 10
The strace method is normally only useful when you can capture the application startup. Applications usually open their listening socket when the application starts, so it is often not useful to attach strace to an already-running process.
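The Send-Q check from method 1 can be scripted. The Python sketch below extracts the backlog per listening port from ss -ntl style text, assuming the column layout of the sample output above (State, Recv-Q, Send-Q, Local Address:Port). The function name is hypothetical, and some ss versions lay out the columns differently, so treat this as a starting point rather than a robust parser.

```python
def listen_backlogs(ss_output):
    """Map local port -> Send-Q (listen backlog) for LISTEN lines of ss -ntl output."""
    result = {}
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "LISTEN":
            # Local address looks like *:9001, 0.0.0.0:9001 or [::]:9001.
            port = int(fields[3].rsplit(":", 1)[1])
            result[port] = int(fields[2])
    return result


# Sample text taken from the ss example in this article.
SAMPLE = (
    "State Recv-Q Send-Q Local Address:Port Peer Address:Port\n"
    'LISTEN 0 10 *:9001 *:* users:(("nc",pid=1234,fd=3))\n'
)

if __name__ == "__main__":
    print(listen_backlogs(SAMPLE))
```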
Root Cause
Overview
This section describes how TCP connections work and what happens in the lead up to a Possible SYN flooding message being logged.
The TCP State Diagram may be useful to understand this information.
Diagram provided by Wikimedia Commons under Creative Commons, Attribution, Share Alike license.
What is a TCP SYN and TCP Handshake?
These items are covered on the knowledgebase at:
How do listening sockets work?
When an application opens a socket, it can connect out to another system (sending a SYN), or it can listen for new connections coming in to this system (sending SYN+ACK when a SYN comes in).
When an application listens, it must accept new connections as they appear. Once a connection is accepted, a new socket is open and data can move back and forth.
When an application chooses to listen, it must provide a backlog value, which determines how many un-accepted connections can sit waiting for the application to accept them.
A connection in the socket backlog will still perform a TCP handshake.
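The behaviour described above can be observed with a small loopback experiment. In the Python sketch below, the client's connect() returns successfully before the server has called accept(), which shows that the kernel completes the handshake for a connection waiting in the backlog. The function name and the backlog value of 5 are arbitrary choices for this illustration.

```python
import socket


def handshake_completes_before_accept(backlog=5):
    """Return True if a client connects while the server has not yet accepted."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    try:
        server.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
        server.listen(backlog)
        # connect() returns only after the three-way handshake has finished,
        # yet the server has not called accept() at this point.
        client = socket.create_connection(server.getsockname(), timeout=5)
        try:
            # The connection was waiting in the backlog; accept() now dequeues it.
            conn, peer = server.accept()
            conn.close()
            return peer == client.getsockname()
        finally:
            client.close()
    finally:
        server.close()


if __name__ == "__main__":
    print(handshake_completes_before_accept())
```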
What is a SYN Cookie?
SYN cookies are a method by which TCP connections can continue to be established when a socket's listen backlog fills up.
SYN cookies allow connections to continue establishing at times when a socket faces a temporary SYN flood, or when the application does not accept new connections fast enough or at all.
If the system's valid workload is such that SYN cookies are being logged regularly, the system and application should be tuned to avoid them.
A SYN cookie is created by crafting a special SYN+ACK where the TCP Sequence Number is a function of the time, the Maximum Segment Size, and the client and server's IP address and port numbers.
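To make the idea concrete, the following Python sketch encodes and checks a cookie using one commonly described textbook layout: 5 bits of a coarse time counter, 3 bits indexing an MSS table, and 24 bits of a keyed hash over the addresses, ports, and counter. This is not the Linux kernel's calculation. The secret, the MSS table, the 64-second counter, and all function names are assumptions made purely for illustration.

```python
import hashlib
import hmac

SECRET = b"example-secret"            # hypothetical server secret
MSS_TABLE = [536, 1300, 1440, 1460]   # hypothetical table; the index must fit in 3 bits


def _mac24(saddr, sport, daddr, dport, t, mss_idx):
    """24-bit keyed hash over the connection identity, time counter and MSS index."""
    msg = f"{saddr}:{sport}>{daddr}:{dport}@{t}/{mss_idx}".encode()
    return int.from_bytes(hmac.new(SECRET, msg, hashlib.sha256).digest()[:3], "big")


def make_cookie(saddr, sport, daddr, dport, mss, now):
    """Build a 32-bit value to use as the SYN+ACK sequence number."""
    t = int(now) // 64                                  # coarse time counter
    # Largest table entry that does not exceed the client's MSS.
    mss_idx = max((i for i, m in enumerate(MSS_TABLE) if m <= mss), default=0)
    return ((t % 32) << 27) | (mss_idx << 24) | _mac24(saddr, sport, daddr, dport, t, mss_idx)


def check_cookie(cookie, saddr, sport, daddr, dport, now, max_age=2):
    """Return the encoded MSS if the cookie is valid and recent, else None."""
    mss_idx = (cookie >> 24) & 0x7
    if mss_idx >= len(MSS_TABLE):
        return None
    t_now = int(now) // 64
    for age in range(max_age + 1):                      # accept a few recent counters
        t = t_now - age
        if (cookie >> 27) == t % 32 and \
                (cookie & 0xFFFFFF) == _mac24(saddr, sport, daddr, dport, t, mss_idx):
            return MSS_TABLE[mss_idx]
    return None
```

Because everything needed to validate the final ACK is carried in the sequence number, the server in this model keeps no per-connection state until the handshake completes, which is why cookies still work when the backlog is full.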
SYN cookies are not part of any RFC, though they do conform to the TCP standard. A full description of the calculation to create a cookie is given at some external sources:
SYN cookies are sent because the functionality is compiled into the RHEL kernel, and enabled by default. SYN cookies are controlled by the kernel tunable:
# sysctl net.ipv4.tcp_syncookies
net.ipv4.tcp_syncookies = 1
Linux kernel SYN Cookies support a limited number of TCP Options. Only the Timestamp, Window Scale, SACK, and ECN options are supported. Other TCP options will not be negotiated.
- Note: If this tunable is set to disable the sending of SYN cookies, the SYN is dropped instead. Disabling SYN cookies will not improve system performance, nor reduce the amount of logging. The log message will change from Sending cookies to Dropping request.
Diagnostic Steps
The following description of the kernel tunables mentioned in the Resolution and Root Cause sections is provided in the kernel-doc package file Documentation/networking/ip-sysctl.txt:
tcp_syncookies - BOOLEAN
Only valid when the kernel was compiled with CONFIG_SYNCOOKIES
Send out syncookies when the syn backlog queue of a socket
overflows. This is to prevent against the common 'SYN flood attack'
Default: FALSE
Note, that syncookies is fallback facility.
It MUST NOT be used to help highly loaded servers to stand
against legal connection rate. If you see SYN flood warnings
in your logs, but investigation shows that they occur
because of overload with legal connections, you should tune
another parameters until this warning disappear.
See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.
syncookies seriously violate TCP protocol, do not allow
to use TCP extensions, can result in serious degradation
of some services (f.e. SMTP relaying), visible not by you,
but your clients and relays, contacting you. While you see
SYN flood warnings in logs not being really flooded, your server
is seriously misconfigured.
somaxconn - INTEGER
Limit of socket listen() backlog, known in userspace as SOMAXCONN.
Defaults to 128. See also tcp_max_syn_backlog for additional tuning
for TCP sockets.
tcp_max_syn_backlog - INTEGER
Maximal number of remembered connection requests, which have not
received an acknowledgment from connecting client.
The minimal value is 128 for low memory machines, and it will
increase in proportion to the memory of machine.
If server suffers from overload, try increasing this number.
The description of the listen() system call is given in man 2 listen and man 3p listen:
LISTEN(2) Linux Programmer’s Manual LISTEN(2)
NAME
listen - listen for connections on a socket
SYNOPSIS
#include <sys/types.h> /* See NOTES */
#include <sys/socket.h>
int listen(int sockfd, int backlog);
DESCRIPTION
listen() marks the socket referred to by sockfd as a passive socket,
that is, as a socket that will be used to accept incoming connection
requests using accept(2).
The sockfd argument is a file descriptor that refers to a socket of
type SOCK_STREAM or SOCK_SEQPACKET.
The backlog argument defines the maximum length to which the queue of
pending connections for sockfd may grow. If a connection request
arrives when the queue is full, the client may receive an error with
an indication of ECONNREFUSED or, if the underlying protocol supports
retransmission, the request may be ignored so that a later reattempt at
connection succeeds.
RETURN VALUE
On success, zero is returned. On error, -1 is returned, and errno is
set appropriately.
Specific line numbers here are from RHEL 6.5 kernel 2.6.32-431.1.1.el6.
The message reporting SYN cookies are being sent is generated at:
net/ipv4/tcp_ipv4.c
790 #ifdef CONFIG_SYN_COOKIES
791 static void syn_flood_warning(struct sk_buff *skb)
792 {
793 static unsigned long warntime;
794
795 if (time_after(jiffies, (warntime + HZ * 60))) {
796 warntime = jiffies;
797 printk(KERN_INFO
798 "possible SYN flooding on port %d. Sending cookies.\n",
799 ntohs(tcp_hdr(skb)->dest));
800 }
801 }
802 #endif
The time_after block prints the message only if it has not already been printed within the last 60 seconds.
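The effect of that check can be modelled outside the kernel. The Python sketch below is a simplified model, not the kernel code: it uses a floating-point timestamp in seconds instead of jiffies, and the class and method names are made up for this example.

```python
class SynFloodWarning:
    """Allow a message at most once per interval, like the warntime check."""

    def __init__(self, interval=60.0):
        self.interval = interval
        self.last = None              # time of the last logged message

    def should_log(self, now):
        # Mirrors time_after(): log only when strictly later than last + interval.
        if self.last is None or now - self.last > self.interval:
            self.last = now
            return True
        return False


if __name__ == "__main__":
    warn = SynFloodWarning()
    for t in (0.0, 30.0, 60.0, 60.5, 100.0):
        print(t, warn.should_log(t))
```

One consequence is that a single log line per minute says nothing about how many SYNs triggered cookies in that minute; the counters in /proc/net/netstat are the better measure of volume.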
The block calling syn_flood_warning is:
net/ipv4/tcp_ipv4.c
1213 int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
1214 {
1215 struct inet_request_sock *ireq;
1216 struct tcp_options_received tmp_opt;
1217 struct request_sock *req;
1218 __be32 saddr = ip_hdr(skb)->saddr;
1219 __be32 daddr = ip_hdr(skb)->daddr;
1220 __u32 isn = TCP_SKB_CB(skb)->when;
1221 struct dst_entry *dst = NULL;
1222 #ifdef CONFIG_SYN_COOKIES
1223 int want_cookie = 0;
1224 #else
1225 #define want_cookie 0 /* Argh, why doesn't gcc optimize this :( */
1226 #endif
1227
1228 /* Never answer to SYNs send to broadcast or multicast */
1229 if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
1230 goto drop;
1231
1232 /* TW buckets are converted to open requests without
1233 * limitations, they conserve resources and peer is
1234 * evidently real one.
1235 */
1236 if (inet_csk_reqsk_queue_is_full(sk) && !isn) {
1237 #ifdef CONFIG_SYN_COOKIES
1238 if (sysctl_tcp_syncookies) {
1239 want_cookie = 1;
1240 } else
1241 #endif
1242 goto drop;
1243 }
...
1286 if (want_cookie) {
1287 #ifdef CONFIG_SYN_COOKIES
1288 syn_flood_warning(skb);
1289 req->cookie_ts = tmp_opt.tstamp_ok;
1290 #endif
1291 isn = cookie_v4_init_sequence(sk, skb, &req->mss);
This means that if SYN cookies are enabled (both compiled in and turned on), the kernel logs the fact that it is sending them.
The check for whether or not to send cookies is inet_csk_reqsk_queue_is_full, which can be traced as follows:
include/net/inet_connection_sock.h
290 static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk)
291 {
292 return reqsk_queue_is_full(&inet_csk(sk)->icsk_accept_queue);
293 }
The queue is considered full when an arithmetic right shift of the socket's queue length by max_qlen_log gives a non-zero result:
include/net/request_sock.h
228 static inline int reqsk_queue_is_full(const struct request_sock_queue *queue)
229 {
230 return queue->listen_opt->qlen >> queue->listen_opt->max_qlen_log;
231 }
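The shift is a compact way of testing whether qlen has reached 2 to the power of max_qlen_log. A small Python model of the same expression, with a hypothetical function name mirroring the kernel one, makes the threshold visible:

```python
def reqsk_queue_is_full(qlen, max_qlen_log):
    """Model of the kernel check: non-zero once qlen >= 2 ** max_qlen_log."""
    return (qlen >> max_qlen_log) != 0


if __name__ == "__main__":
    # With max_qlen_log = 8 the queue counts as full from 256 entries onwards.
    for qlen in (0, 255, 256, 300):
        print(qlen, reqsk_queue_is_full(qlen, 8))
```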
max_qlen_log is given an upper bound by sysctl_max_syn_backlog (the net.ipv4.tcp_max_syn_backlog kernel tunable):
net/core/request_sock.c
38 int reqsk_queue_alloc(struct request_sock_queue *queue,
39 unsigned int nr_table_entries)
40 {
...
44 nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
45 nr_table_entries = max_t(u32, nr_table_entries, 8);
46 nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);
...
57 for (lopt->max_qlen_log = 3;
58 (1 << lopt->max_qlen_log) < nr_table_entries;
59 lopt->max_qlen_log++);
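The arithmetic in reqsk_queue_alloc can be followed with a worked example. The Python sketch below models only the lines quoted above for this RHEL 6.5 kernel; the function names mirror the kernel identifiers, but the code is an illustration and other kernel versions compute this differently.

```python
def roundup_pow_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def max_qlen_log(backlog, max_syn_backlog):
    """Model of the max_qlen_log calculation quoted from reqsk_queue_alloc."""
    nr_table_entries = min(backlog, max_syn_backlog)   # capped by tcp_max_syn_backlog
    nr_table_entries = max(nr_table_entries, 8)        # never fewer than 8
    nr_table_entries = roundup_pow_of_two(nr_table_entries + 1)
    log = 3
    while (1 << log) < nr_table_entries:
        log += 1
    return log


if __name__ == "__main__":
    # (backlog already capped by somaxconn, tcp_max_syn_backlog)
    for backlog, syn_backlog in ((128, 512), (2048, 512), (2048, 4096)):
        log = max_qlen_log(backlog, syn_backlog)
        print(backlog, syn_backlog, "-> max_qlen_log", log, "full at", 1 << log)
```

In this model, raising the backlog to 2048 while tcp_max_syn_backlog stays at 512 gives the same result as a backlog of 512, which is why the Resolution section notes that tcp_max_syn_backlog may also need to be increased.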
We create this limit when we start a listening socket:
net/ipv4/inet_connection_sock.c
683 int inet_csk_listen_start(struct sock *sk, const int nr_table_entries)
...
687 int rc = reqsk_queue_alloc(&icsk->icsk_accept_queue, nr_table_entries);
Where the limit comes from the backlog parameter:
net/ipv4/af_inet.c
188 /*
189 * Move a socket into listening state.
190 */
191 int inet_listen(struct socket *sock, int backlog)
...
211 err = inet_csk_listen_start(sk, backlog);
Which is passed from userspace using the listen() system call:
net/socket.c
1464 /*
1465 * Perform a listen. Basically, we allow the protocol to do anything
1466 * necessary for a listen, and if that works, we mark the socket as
1467 * ready for listening.
1468 */
1469
1470 SYSCALL_DEFINE2(listen, int, fd, int, backlog)
1471 {
1472 struct socket *sock;
1473 int err, fput_needed;
1474 int somaxconn;
1475
1476 sock = sockfd_lookup_light(fd, &err, &fput_needed);
1477 if (sock) {
1478 somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn;
1479 if ((unsigned)backlog > somaxconn)
1480 backlog = somaxconn;
1481
1482 err = security_socket_listen(sock, backlog);
1483 if (!err)
1484 err = sock->ops->listen(sock, backlog);
1485
1486 fput_light(sock->file, fput_needed);
1487 }
1488 return err;
1489 }
The backlog is given a max bound by sysctl_somaxconn (the net.core.somaxconn kernel tunable).
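The clamp in the listen() system call can be modelled in a few lines. In the Python sketch below, the mask stands in for the C (unsigned) cast of a 32-bit int, so a negative backlog becomes a very large number and is also reduced to somaxconn. The function name is hypothetical.

```python
def effective_backlog(requested, somaxconn):
    """Model of the somaxconn clamp applied to the listen() backlog argument."""
    as_unsigned = requested & 0xFFFFFFFF   # models the (unsigned) cast of a 32-bit int
    return somaxconn if as_unsigned > somaxconn else requested


if __name__ == "__main__":
    # An application asking for 2048 with the default somaxconn of 128 gets 128.
    for requested in (64, 2048, -1):
        print(requested, "->", effective_backlog(requested, 128))
```

This is why both changes in the Resolution section are required: a larger value passed to listen() has no effect while net.core.somaxconn is still lower.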
So if a socket's listen queue is full, and more SYNs arrive for that socket, then we either send SYN cookies, or if SYN cookies are disabled then we drop the incoming traffic.