How to enable valgrind for Red Hat Directory Server version 10?
Environment
Red Hat Directory Server version 10
Red Hat Enterprise Linux version 7
Issue
Show by example how to run the valgrind tool to detect memory leaks in Red Hat Directory Server version 10 on Red Hat Enterprise Linux version 7.
Resolution
IMPORTANT WARNING:
Running the LDAP service under valgrind has a severe drawback: performance will likely be very poor. Depending on the hardware, the LDAP traffic, and any verbose logging set in ns-slapd, the LDAP service may become extremely slow to respond or appear not to respond at all.
Adding RAM may help slightly.
If it is acceptable to install the valgrind tool and run ns-slapd under valgrind knowing these possible effects, continue with this article to investigate possible ns-slapd memory leaks:
Install the 389-ds-base and glibc debuginfo packages and the valgrind tool on a test system:
debuginfo-install 389-ds-base glibc
yum install -y valgrind
Stop the LDAP service, and verify it is stopped:
systemctl stop dirsrv.target
lsof -i :389
Keep a copy of the systemd unit file of the installed instance. This example uses an instance named m1; replace the string "m1" with the instance name used in your environment:
cp -p /etc/systemd/system/dirsrv.target.wants/dirsrv\@m1.service ~/etc.systemd.system.dirsrv.target.wants.dirsrv\@m1.service.orig
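The copy step can also be wrapped in a small helper that refuses to overwrite an earlier backup, so that re-running the procedure never replaces the pristine unit file with an already modified one. This is only a sketch: the function name backup_unit and the shorter <basename>.orig naming are assumptions made for illustration and differ from the backup name used in the command above.

```shell
#!/bin/bash
# backup_unit UNIT_FILE [BACKUP_DIR]
# Hypothetical helper: copy UNIT_FILE to BACKUP_DIR (default: $HOME) as
# <basename>.orig, preserving mode and timestamps, without overwriting
# an existing backup.
backup_unit() {
    local unit_file=$1
    local backup_dir=${2:-$HOME}
    local backup
    backup="$backup_dir/$(basename "$unit_file").orig"

    if [ -e "$backup" ]; then
        echo "backup already exists, not overwriting: $backup" >&2
        return 1
    fi
    # Print the backup path on success so the caller can record it
    cp -p "$unit_file" "$backup" && echo "$backup"
}
```

For the m1 example, backup_unit /etc/systemd/system/dirsrv.target.wants/dirsrv@m1.service would print the path of the copy created in the home directory.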
The goal is to run ns-slapd under valgrind with the following options:
ExecStart=/usr/bin/valgrind -v --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=50 --trace-children=yes --show-reachable=yes --track-origins=yes --read-var-info=yes --log-file=/var/tmp/valgrind.%p.out /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid
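Modify the systemd unit file of the installed instance so that the original ExecStart line is commented out and replaced by the valgrind command. This example uses an instance named m1; replace the string "m1" with the instance name used in your environment: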
sed -i 's/^\(ExecStart=.*$\)/#\1\nExecStart=\/usr\/bin\/valgrind -v --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=50 --trace-children=yes --show-reachable=yes --track-origins=yes --read-var-info=yes --log-file=\/var\/tmp\/valgrind.%p.out \/usr\/sbin\/ns-slapd -D \/etc\/dirsrv\/slapd-%i -i \/var\/run\/dirsrv\/slapd-%i.pid/' /etc/systemd/system/dirsrv.target.wants/dirsrv\@m1.service
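The same edit can be written as a function that takes the unit file path as a parameter. Using a different sed delimiter avoids escaping every slash, and the original ExecStart command is reused through a back-reference instead of being typed again. This is a minimal sketch, assuming GNU sed (for -i and for \n in the replacement text); the function name wrap_execstart_with_valgrind is hypothetical.

```shell
#!/bin/bash
# wrap_execstart_with_valgrind UNIT_FILE
# Hypothetical helper: comment out the active ExecStart= line and add a new
# one that runs the same command under valgrind with the options from this
# article. Assumes GNU sed.
wrap_execstart_with_valgrind() {
    local unit_file=$1
    local valgrind_opts="-v --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=50 --trace-children=yes --show-reachable=yes --track-origins=yes --read-var-info=yes --log-file=/var/tmp/valgrind.%p.out"

    # Do nothing if the unit file was already modified
    if grep -q '^ExecStart=/usr/bin/valgrind ' "$unit_file"; then
        echo "ExecStart already runs valgrind: $unit_file" >&2
        return 1
    fi

    # \1 holds the original command; ExecStartPre= lines do not match
    sed -i "s|^ExecStart=\(.*\)\$|#ExecStart=\1\nExecStart=/usr/bin/valgrind ${valgrind_opts} \1|" "$unit_file"
}
```

Calling it a second time on the same file returns an error instead of wrapping valgrind inside valgrind.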
Reload the systemd manager configuration so the modified unit file is taken into account:
systemctl daemon-reload
Optional: show system messages:
tail -f /var/log/messages &
Restart the Red Hat Directory Server:
systemctl restart dirsrv.target
Provide the valgrind output file for review when there are signs of a memory leak, an excessive process size, or an out-of-memory situation.
Root Cause
The unusual in-memory size of the LDAP service process needs to be investigated, here for the ns-slapd binary provided by Red Hat Directory Server.
This applies, for example, when the kernel selects the ns-slapd process as an Out Of Memory (OOM) candidate and terminates it.
The valgrind tool can report possible memory leaks and point to the portions of source code that cause them.
Diagnostic Steps
Example output after modifying the systemd unit file of the installed instance. This example uses an instance named m1; replace the string "m1" with the instance name used in your environment:
diff -u ~/etc.systemd.system.dirsrv.target.wants.dirsrv\@m1.service.orig /etc/systemd/system/dirsrv.target.wants/dirsrv\@m1.service
--- /root/etc.systemd.system.dirsrv.target.wants.dirsrv@m1.service.orig 2018-09-11 16:12:12.588000000 +0000
+++ /etc/systemd/system/dirsrv.target.wants/dirsrv@m1.service 2018-09-11 16:14:13.323000000 +0000
@@ -26,7 +26,8 @@
EnvironmentFile=/etc/sysconfig/dirsrv-%i
PIDFile=/var/run/dirsrv/slapd-%i.pid
ExecStartPre=/usr/sbin/ds_systemd_ask_password_acl /etc/dirsrv/slapd-%i/dse.ldif
-ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid
+#ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid
+ExecStart=/usr/bin/valgrind -v --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=50 --trace-children=yes --show-reachable=yes --track-origins=yes --read-var-info=yes --log-file=/var/tmp/valgrind.%p.out /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid
# if you need to set other directives e.g. LimitNOFILE=8192
# set them in this file
.include /etc/sysconfig/dirsrv.systemd
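In addition to reading the diff, the edited unit file can be checked mechanically before reloading systemd. The sketch below verifies that exactly one uncommented ExecStart= line remains (systemd accepts only one for a service that is not of type oneshot) and that it runs ns-slapd under valgrind. The function name check_unit is hypothetical.

```shell
#!/bin/bash
# check_unit UNIT_FILE
# Hypothetical helper: succeed only if UNIT_FILE has exactly one active
# ExecStart= line and that line starts /usr/sbin/ns-slapd under valgrind.
check_unit() {
    local unit_file=$1
    local active

    # grep -c prints 0 but exits non-zero when nothing matches
    active=$(grep -c '^ExecStart=' "$unit_file" || true)
    if [ "$active" -ne 1 ]; then
        echo "expected 1 active ExecStart= line, found $active" >&2
        return 1
    fi

    if ! grep -q '^ExecStart=/usr/bin/valgrind .* /usr/sbin/ns-slapd ' "$unit_file"; then
        echo "ExecStart= does not run ns-slapd under valgrind" >&2
        return 1
    fi
}
```

For the m1 example: check_unit /etc/systemd/system/dirsrv.target.wants/dirsrv@m1.service && systemctl daemon-reload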
Example system messages when starting the LDAP service. Note that valgrind has ns-slapd under its control:
Sep 11 16:15:30 m1 systemd: Starting 389 Directory Server m1....
Sep 11 16:15:41 m1 valgrind: [11/Sep/2018:16:15:41.732954213 +0000] - NOTICE - slapd_bootstrap_config - nsslapd-errorlog-level: ignoring 16384 (since -d 266354688 was given on the command line)
...
Sep 11 16:16:18 m1 valgrind: [11/Sep/2018:16:16:18.164395617 +0000] - WARN - Security Initialization - SSL alert: Sending pin request to SVRCore. You may need to run systemd-tty-ask-password-agent to provide the password.
Sep 11 16:16:18 m1 valgrind: [11/Sep/2018:16:16:18.591547707 +0000] - INFO - Security Initialization - SSL info: Configured NSS Ciphers
...
Sep 11 16:16:41 m1 valgrind: [11/Sep/2018:16:16:41.272660335 +0000] - INFO - slapd_daemon - slapd started. Listening on All Interfaces port 389 for LDAP requests
Sep 11 16:16:41 m1 valgrind: [11/Sep/2018:16:16:41.281664936 +0000] - INFO - slapd_daemon - Listening on All Interfaces port 636 for LDAPS requests
Sep 11 16:16:41 m1 systemd: Started 389 Directory Server m1..
Sep 11 16:16:41 m1 systemd: Reached target 389 Directory Server.
Sep 11 16:16:41 m1 systemd: Starting 389 Directory Server.
Optional: verify there is a TCP listener for the LDAP service:
lsof -i :389
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
memcheck- 6505 ldapuser1 8u IPv6 46684 0t0 TCP *:ldap (LISTEN)
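Verify there is a valgrind output file. Because systemd expands %p in a unit file to the unit's prefix name (dirsrv here), the file is named valgrind.dirsrv.out rather than containing the process ID: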
ls -lh /var/tmp/valg*
-rw-r--r--. 1 root root 60K Sep 11 16:16 /var/tmp/valgrind.dirsrv.out
less /var/tmp/valgrind.dirsrv.out
==6505== Memcheck, a memory error detector
==6505== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6505== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==6505== Command: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-m1 -i /var/run/dirsrv/slapd-m1.pid
==6505== Parent PID: 1
...
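Once the log grows, the leak totals are easier to find with a small filter than by paging through the file. Memcheck performs its leak check when the process exits, so the LEAK SUMMARY section normally appears only after the LDAP service has been stopped. The sketch below assumes the standard Memcheck summary layout ("definitely lost: N bytes in M blocks"); the function names are hypothetical.

```shell
#!/bin/bash
# leak_summary LOG_FILE
# Print the LEAK SUMMARY block, from its header to the "suppressed:" line.
leak_summary() {
    sed -n '/LEAK SUMMARY:/,/suppressed:/p' "$1"
}

# definitely_lost_bytes LOG_FILE
# Print the byte count of the "definitely lost:" summary line with the
# thousands separators removed. The colon in the pattern skips the
# per-record lines ("... are definitely lost in loss record ...").
definitely_lost_bytes() {
    awk '/definitely lost:/ { gsub(",", "", $4); print $4 }' "$1" | tail -n 1
}
```

For example, leak_summary /var/tmp/valgrind.dirsrv.out prints the totals, and definitely_lost_bytes can be used in a script to compare two test runs.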
From that point, let the LDAP service run and carry out the tests until there is evidence of a memory leak, for example by using top. Example with 4 samples only:
lsof -i :389 > top.log; echo "" >> top.log; top -d 1 -b -M -n 4 -p $(pidof valgrind) >> top.log
Or list the first 4 processes by memory usage:
ps auwwx|gawk '!/%MEM/ {print $4,$11}'|sort -nr|head -n4
58.5 /usr/bin/valgrind
...
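To follow the growth over a longer test, the resident set size of the process can be sampled at a fixed interval and written with a timestamp. This is a minimal sketch that reads VmRSS from /proc on Linux instead of parsing top output; the function names, defaults, and output format are assumptions for illustration.

```shell
#!/bin/bash
# rss_kib PID
# Print the resident set size of PID in KiB, read from /proc (Linux only).
rss_kib() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# sample_rss PID [COUNT] [INTERVAL_SECONDS]
# Print COUNT lines of "<epoch seconds> <RSS in KiB>", INTERVAL seconds apart.
# Defaults: 4 samples, 1 second apart. Fails if the process is gone.
sample_rss() {
    local pid=$1
    local count=${2:-4}
    local interval=${3:-1}
    local i=0
    local rss

    while [ "$i" -lt "$count" ]; do
        rss=$(rss_kib "$pid") || return 1
        printf '%s %s\n' "$(date +%s)" "$rss"
        i=$((i + 1))
        # Do not sleep after the last sample
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, with the PID reported by lsof -i :389, sample_rss 6505 60 10 > rss.log records one sample every 10 seconds for about 10 minutes. A steadily increasing second column while the LDAP load stays constant is a sign worth reporting along with the valgrind output file.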
Then provide the valgrind output file for review.
References:
How to use OS utilities to track down application memory leaks - https://access.redhat.com/solutions/32526