Unable to bind to a port even when lsof/netstat shows no usage


Environment

  • Red Hat Enterprise Linux

Issue

  • Customer's application cannot bind to a port
  • lsof and netstat do not show the port as in use
  • The port is not in the "bound but not listening" state as we'd usually expect with the above symptoms.
  • java.net.BindException: Address already in use

Resolution

Identify the process that is holding the socket and terminate it.
For RHEL 5 or later:

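#!/bin/bash
# Report processes that hold a socket whose inode is missing from /proc/net/*.
report=$(mktemp) || exit 1
for pid in $(ls /proc/ | grep -E '^[0-9]+$'); do
    # -xtype s selects fd symlinks that resolve to a socket.
    find -H /proc/$pid/fd/ -maxdepth 1 -xtype s 2>/dev/null |
    while read -r fd; do
        # The link target has the form socket:[INODE]; keep only the digits.
        inode=$(readlink "$fd" 2>/dev/null | sed -e 's/[^0-9]//g')
        [ -n "$inode" ] || continue
        # -w matches the inode as a whole word, so 123 does not match 1234.
        if ! grep -qw "$inode" /proc/net/* 2>/dev/null; then
            echo "$fd -> socket:[$inode]" | tee -a "$report"
        fi
    done
done
if [ -s "$report" ]; then
    # Field 3 of /proc/PID/fd/N is the PID.
    awk -F/ '{print $3}' "$report" | sort -un | xargs ps -lp
else
    echo "No sockets found that are bound but not in use."
fi
rm -f "$report"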

For RHEL 4 (the only difference is that find is called without -H):

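#!/bin/bash
# Report processes that hold a socket whose inode is missing from /proc/net/*.
report=$(mktemp) || exit 1
for pid in $(ls /proc/ | grep -E '^[0-9]+$'); do
    # -xtype s selects fd symlinks that resolve to a socket.
    find /proc/$pid/fd/ -maxdepth 1 -xtype s 2>/dev/null |
    while read -r fd; do
        # The link target has the form socket:[INODE]; keep only the digits.
        inode=$(readlink "$fd" 2>/dev/null | sed -e 's/[^0-9]//g')
        [ -n "$inode" ] || continue
        # -w matches the inode as a whole word, so 123 does not match 1234.
        if ! grep -qw "$inode" /proc/net/* 2>/dev/null; then
            echo "$fd -> socket:[$inode]" | tee -a "$report"
        fi
    done
done
if [ -s "$report" ]; then
    # Field 3 of /proc/PID/fd/N is the PID.
    awk -F/ '{print $3}' "$report" | sort -un | xargs ps -lp
else
    echo "No sockets found that are bound but not in use."
fi
rm -f "$report"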

The scripts above report the processes that hold a socket in this state, but they do not show which port numbers are involved.
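
Once the report has identified the process, it has to be terminated to release the port. The following is a minimal sketch of one cautious way to do that, not part of the original solution: it sends SIGTERM first so the application can shut down cleanly, and falls back to SIGKILL only if the process is still present after a timeout. The function name stop_pid and the 5-second default are assumptions made for illustration.

```shell
#!/bin/bash
# stop_pid PID [TIMEOUT] (hypothetical helper): send SIGTERM, wait up to
# TIMEOUT seconds (default 5), then send SIGKILL if the process is still there.
stop_pid() {
    local pid=$1 timeout=${2:-5} waited=0
    # kill fails if the process does not exist or we lack permission.
    kill -TERM "$pid" 2>/dev/null || return 1
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

For example, "stop_pid 4645" would ask PID 4645 to exit and force it after five seconds. Run it as root or as the owner of the process.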

If crash is available (or can be loaded), the following command can be used, replacing PORTNUMBER with the port number you are interested in.

echo "foreach net -s | grep -e PID -e \-PORTNUMBER" | crash -s > foo

The results are a list of PIDs running on the system and any sockets that are using that port number. Note that the port number may be a local port or remote port.

In the following example the port of interest is 5001. The process holding the port appears on the line above the socket information; in this case it is PID 4645, running the command bind_only.

echo "foreach net -s | grep -e PID -e \-5001" | crash -s > foo
cat foo
PID: 0      TASK: ffffffff81a8d020  CPU: 0   COMMAND: "swapper"
PID: 0      TASK: ffff880079482040  CPU: 1   COMMAND: "swapper"
PID: 0      TASK: ffff88017f192aa0  CPU: 2   COMMAND: "swapper"
. . . . .
PID: 4602   TASK: ffff88017bd15500  CPU: 6   COMMAND: "bash"
PID: 4617   TASK: ffff880077605540  CPU: 0   COMMAND: "bash"
PID: 4632   TASK: ffff880179157500  CPU: 0   COMMAND: "bash"
PID: 4645   TASK: ffff880076dc4040  CPU: 0   COMMAND: "bind_only"
 3 ffff880037e2d400 ffff880036dcee40 INET:STREAM  0.0.0.0-5001 0.0.0.0-0
PID: 4646   TASK: ffff88017d438080  CPU: 6   COMMAND: "metacity"
PID: 11912  TASK: ffff88007704c080  CPU: 12  COMMAND: "vi"
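
On a busy system the saved output consists mostly of PID lines for tasks that have no matching socket. As a sketch, the file can be reduced to the socket lines and the PID line that owns each of them. The function name socket_owners is hypothetical, and the filter assumes the layout shown above, where every line that does not start with "PID:" is a socket line belonging to the most recent PID line.

```shell
#!/bin/bash
# socket_owners FILE (hypothetical helper): from saved "foreach net -s"
# output, print each socket line preceded by the PID line of its owner.
socket_owners() {
    awk '
        /^PID:/ { owner = $0; shown = 0; next }    # remember the latest task header
        NF > 0 {                                   # any other non-empty line is a socket
            if (!shown) { print owner; shown = 1 } # print the owner once per task
            print
        }
    ' "$1"
}
```

Calling "socket_owners foo" on the file from the example would leave only the bind_only PID line and its socket line.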

Root Cause

Some applications call bind() on a socket but never call listen() or accept() on it. Such a socket does not appear in lsof or netstat output, yet the port remains in use.

This can also happen if an outgoing connection request is refused or fails for some other reason and the socket is not closed.

Diagnostic Steps

  • A systemtap script was used to check for ports that were bound but had not been set to listen, and thus wouldn't show up in netstat/lsof. The customer did not see the port in question output by this script.
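
As a further check that does not need systemtap, the kernel's TCP socket tables, which netstat reads, can be searched directly for the port. The sketch below is an illustration rather than the script used in the case above. It relies on the /proc/net/tcp layout, in which the second field is the local address written as HEXADDR:HEXPORT with the port as four uppercase hex digits. The function names port_to_hex and port_listed are hypothetical.

```shell
#!/bin/bash
# port_to_hex PORT: print PORT as the 4-digit uppercase hex used in /proc/net/tcp.
port_to_hex() {
    printf '%04X\n' "$1"
}

# port_listed PORT [FILE...]: succeed if PORT appears as a local port in any
# of the given socket tables (default: the IPv4 and IPv6 TCP tables).
port_listed() {
    local hex f
    hex=$(port_to_hex "$1")
    shift
    if [ $# -eq 0 ]; then
        set -- /proc/net/tcp /proc/net/tcp6
    fi
    for f in "$@"; do
        [ -r "$f" ] || continue
        # Skip the header line; compare the last 5 characters of field 2 (":PORT").
        if awk -v port=":$hex" '
               FNR > 1 && substr($2, length($2) - 4) == port { found = 1 }
               END { exit !found }
           ' "$f"; then
            return 0
        fi
    done
    return 1
}
```

For example, "port_listed 5001 || echo 'port 5001 is not in the TCP tables'" prints the message when no TCP socket with local port 5001 is listed. If bind() still fails with "Address already in use" in that situation, the port is probably held by a socket in the state described under Root Cause.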
