Red Hat Directory Server update 8.2.3-2.el5dsrv and PR_Write Netscape Portable Runtime error -5961 (TCP connection reset by peer.)

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux 5
  • Red Hat Directory Server 8, updated from 8.1 to 8.2
    redhat-ds-8.2.0-2.el5dsrv
    redhat-ds-base-8.2.5-1.el5dsrv
    

Issue

After an update of Red Hat Directory Server (RHDS) from 8.1 to 8.2, up to at least redhat-ds-base-8.2.5-1, the error log (for example /var/log/dirsrv/slapd-ca1/errors) may show occurrences of entries like the following:

[14/Jul/2011:18:00:24 -0400] - PR_Write(67) Netscape Portable Runtime error -5961 (TCP connection reset by peer.)

There is no indication that anything is broken and no other problems are seen; it only looks as if the LDAP TCP connections are terminated early by one end.

Resolution

These unexpected errors can be ignored.
A possible workaround is to use the logpipe script to exclude the messages that are filling up the log.
There is a fix upstream for 389_1.2.9.
The final fix is expected to be in RHDS 9.0 later in 2011.
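
The filtering idea behind the workaround can be sketched as a small standalone Python filter. This is only an illustration, not the logpipe script itself or its plugin interface: the names is_noise and filter_lines are hypothetical, and the pattern is derived from the sample log entry shown in this article.

```python
import re

# Matches the ignorable error-log entries described in this article, e.g.
# "[14/Jul/2011:18:00:24 -0400] - PR_Write(67) Netscape Portable Runtime
#  error -5961 (TCP connection reset by peer.)"
NOISE = re.compile(r"PR_Write\(\d+\) Netscape Portable Runtime error -5961 ")


def is_noise(line):
    """Return True when the line is one of the PR_Write -5961 entries."""
    return NOISE.search(line) is not None


def filter_lines(lines):
    """Yield only the lines that are not PR_Write -5961 noise."""
    for line in lines:
        if not is_noise(line):
            yield line


# Hypothetical usage, reading a log on stdin and writing the filtered copy:
#   import sys
#   sys.stdout.writelines(filter_lines(sys.stdin))
```

Only the exact PR_Write -5961 message is dropped, so other NSPR errors still reach the log.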

Root Cause

These NSPR errors at the network level happen because an ns-slapd instance may try to answer an UNBIND request just as the other end closes the TCP connection.
A function recently started to log more events than necessary for the NSPR call to PR_Write.
This is not an error, nor a real problem for the function of the LDAP services, but the ns-slapd instances should not log these events.
Although they may fill the logs more than usual and make other troubleshooting more difficult, the PR_Write messages with code -5961 can be ignored in this case; they are just noise.
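
The situation described above, a write to a connection that the peer has already torn down, can be reproduced outside of Directory Server with plain sockets. The sketch below is an assumption-laden illustration and not ns-slapd code: it uses a loopback connection, a hypothetical function name, and SO_LINGER with a zero timeout so that the client close sends a TCP RST right away (a real LDAP client usually just closes normally). It assumes a Linux-like socket API.

```python
import errno
import socket
import struct
import time


def write_after_peer_reset():
    """Return the errno name seen when writing to a connection the peer reset."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _addr = listener.accept()
    try:
        # Assumption for determinism: SO_LINGER with a zero timeout makes
        # close() send a TCP RST instead of a normal FIN.
        client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.pack("ii", 1, 0))
        client.close()
        for _attempt in range(5):
            time.sleep(0.1)  # give the reset time to arrive
            try:
                server_side.send(b"late reply")
            except OSError as exc:
                return errno.errorcode.get(exc.errno, str(exc.errno))
        return None  # no error observed
    finally:
        server_side.close()
        listener.close()


print(write_after_peer_reset())
```

The write fails with either ECONNRESET or EPIPE, the two system errors that NSPR reports as -5961.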

Diagnostic Steps

-5961 is PR_CONNECT_RESET_ERROR, which can occur if the system call returns either EPIPE or ECONNRESET.
U1 code in the access log:

      •   U1 = Connection closed by server after client sends an UNBIND request. The
          server will always close the connection when it sees an UNBIND request.

A simple search can show similar behavior. Example for file descriptor number 64, access and then error logs:

[25/Mar/2011:14:36:25 -0700] conn=5 fd=64 slot=64 connection from 10.14.54.219 to 10.14.7.221
[25/Mar/2011:14:36:25 -0700] conn=5 op=0 BIND dn="cn=directory manager" method=128 version=3
[25/Mar/2011:14:36:25 -0700] conn=5 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=directory manager"
[25/Mar/2011:14:36:25 -0700] conn=5 op=1 SRCH base="dc=testme" scope=2 filter="(objectClass=*)" attrs=ALL
[25/Mar/2011:14:36:30 -0700] conn=5 op=-1 fd=64 closed error 104 (Connection reset by peer) - TCP connection reset by peer.

[25/Mar/2011:14:36:30 -0700] - PR_Write(64) Netscape Portable Runtime error -5961 (TCP connection reset by peer.)

grep 104 /usr/include/asm-generic/errno.h
#define ECONNRESET      104     /* Connection reset by peer */
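
The same lookup can be done without the kernel headers, using Python's standard errno module. This is a minimal sketch; the helper name describe is hypothetical, and the numeric values printed depend on the platform (on Linux, ECONNRESET should match the 104 shown by grep above).

```python
import errno
import os

# NSPR's value for PR_CONNECT_RESET_ERROR, as quoted in the log message.
PR_CONNECT_RESET_ERROR = -5961

# OS-level errno values that surface as this NSPR error.
RESET_ERRNOS = (errno.EPIPE, errno.ECONNRESET)


def describe(err):
    """Return 'NAME (number): message' for an OS errno value."""
    return "%s (%d): %s" % (errno.errorcode[err], err, os.strerror(err))


print("NSPR error %d can come from:" % PR_CONNECT_RESET_ERROR)
for err in RESET_ERRNOS:
    print("  " + describe(err))
```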

Example for a connection with file descriptor number 111, access and then error logs:

[21/Jul/2011:13:23:18 +0200] conn=7 fd=111 slot=111 SSL connection from 10.0.0.10 to 10.0.0.10
[21/Jul/2011:13:23:18 +0200] conn=7 SSL 256-bit AES
[21/Jul/2011:13:23:18 +0200] conn=7 op=0 SRCH base="ou=people,dc=example,dc=com" scope=2 filter="(objectClass=*)" attrs=ALL
[21/Jul/2011:13:23:18 +0200] conn=7 op=1 UNBIND
[21/Jul/2011:13:23:18 +0200] conn=7 op=1 fd=111 closed - U1

[21/Jul/2011:13:23:18 +0200] - PR_Write(64) Netscape Portable Runtime error -5961 (TCP connection reset by peer.)
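
The manual comparison of access and error logs shown above can be sketched in Python. This is an illustration under assumptions: the regular expressions are derived only from the sample lines in this article, the function name correlate is hypothetical, and matching is done on the timestamp, which has one-second resolution, so a busy server may yield several candidate connections per error.

```python
import re

# Access-log close line, e.g.
# "[21/Jul/2011:13:23:18 +0200] conn=7 op=1 fd=111 closed - U1"
CLOSE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) op=-?\d+ "
    r"fd=(?P<fd>\d+) closed\b(?P<why>.*)$")

# Error-log entry, e.g.
# "[21/Jul/2011:13:23:18 +0200] - PR_Write(64) Netscape Portable Runtime
#  error -5961 (TCP connection reset by peer.)"
ERR_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] - PR_Write\((?P<fd>\d+)\) "
    r"Netscape Portable Runtime error -5961\b")


def correlate(access_lines, error_lines):
    """Pair each PR_Write -5961 error with access-log closes at the same time.

    Returns tuples (timestamp, error_fd, conn, access_fd, close_reason).
    """
    closes = {}
    for line in access_lines:
        m = CLOSE_RE.match(line.strip())
        if m:
            closes.setdefault(m.group("ts"), []).append(
                (int(m.group("conn")), int(m.group("fd")),
                 m.group("why").strip(" -")))
    pairs = []
    for line in error_lines:
        m = ERR_RE.match(line.strip())
        if m:
            for conn, fd, why in closes.get(m.group("ts"), []):
                pairs.append((m.group("ts"), int(m.group("fd")), conn, fd, why))
    return pairs
```

Called with the lines of the access log and the error log, it returns one tuple per matching pair, which makes it easier to check whether each -5961 entry lines up with a U1 or "reset by peer" close.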

Comments

See Red Hat Bugzilla number 712855
