3. Tracking Server and Database Performance

Red Hat Directory Server has two methods of recording and tracking performance data: performance counters and logs. Counters are used to determine how well the Directory Server is performing, particularly in database performance; logs are used to diagnose any problem areas with server and LDAP operations and configuration.
Performance counters focus on the operations and information of the Directory Server for the server, all configured databases, and database links (chaining databases).
There are three types of logs: access (for client connections), errors (for errors, warnings, and details of events), and audit (changes to Directory Server configuration). The access and error logs run by default (and the errors log is required for the server to run). Audit logging, because of the overhead, must be enabled manually.

Note

The access log is buffered. This allows full access logging even with highly loaded servers, but there is a time lag between when the event occurs in the server and when the event is written to the log.

3.1. Monitoring Server Activity

The Directory Server's current activities can be monitored from either the Directory Server Console or the command line. It is also possible to monitor the activity of the caches for all of the databases.

Note

Some of the server counters monitored by Directory Server use 64-bit integers, even on 32-bit systems (total connections, operations initiated, operations completed, entries sent, and bytes sent). On high-volume systems, this keeps the counters from rolling over too quickly and skewing monitoring data.
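As a rough illustration of why the counter width matters, the following sketch estimates how long a counter takes to wrap at a steady rate. It assumes unsigned counters and a hypothetical load of 10,000,000 bytes sent per second; neither figure comes from Directory Server itself.

```python
# Time until an unsigned counter of a given width wraps around,
# assuming a constant rate of increase (a simplification).

def seconds_until_wrap(bits: int, rate_per_second: float) -> float:
    """Return how long a counter with `bits` bits takes to wrap at a steady rate."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return (2 ** bits) / rate_per_second


# Hypothetical load: 10,000,000 bytes sent to clients every second.
rate = 10_000_000
print(f"32-bit: {seconds_until_wrap(32, rate) / 60:.1f} minutes")
print(f"64-bit: {seconds_until_wrap(64, rate) / (3600 * 24 * 365):.0f} years")
```

At that rate a 32-bit byte counter would wrap in about seven minutes, while a 64-bit counter would take tens of thousands of years.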

3.1.1. Monitoring the Server from the Directory Server Console

  1. In the Directory Server Console, select the Status tab.
  2. In the navigation tree, select Performance Counters.
    The Status tab in the right pane displays current information about server activity. If the server is currently not running, this tab will not provide performance monitoring information.
The General Information table shows basic information about the server, which helps set a baseline about the statistics that have been gathered.

Table 1. General Information (Server)

Field Description
Server Version Identifies the current server version.
Startup Time on Server The date and time the server was started.
Current Time on Server The current date and time on the server.
The Resource Summary table shows the totals of all operations performed by that instance.

Table 2. Resource Summary

Resource Usage Since Startup Average Per Minute
Connections The total number of connections to this server since server startup. Average number of connections per minute since server startup.
Operations Initiated The total number of operations initiated since server startup. Operations include any client requests for server action, such as searches, adds, and modifies. Often, multiple operations are initiated for each connection. Average number of operations initiated per minute since server startup.
Operations Completed The total number of operations completed by the server since server startup. Average number of operations completed per minute since server startup.
Entries Sent to Clients The total number of entries sent to clients since server startup. Entries are sent to clients as the result of search requests. Average number of entries sent to clients per minute since server startup.
Bytes Sent to Clients The total number of bytes sent to clients since server startup. Average number of bytes sent to clients per minute since server startup.
The Current Resource Usage table shows the current demands on the server.

Table 3. Current Resource Usage

Resource Current Total
Active Threads The current number of active threads used for handling requests. Additional threads may be created by internal server tasks, such as replication or chaining.
Open Connections The total number of open connections. Each connection can account for multiple operations, and therefore multiple threads.
Remaining Available Connections The total number of remaining connections that the server can concurrently open. This number is based on the number of currently open connections and the total number of concurrent connections that the server is allowed to open. In most cases, the latter value is determined by the operating system and is expressed as the number of file descriptors available to a task.
Threads Waiting to Write to Client The total number of threads waiting to write to the client. A thread may have to wait when the server must pause while sending data to a client. Reasons for a pause include a slow network, a slow client, or an extremely large amount of information being sent to the client.
Threads Waiting to Read from Client The total number of threads waiting to read from the client. A thread may have to wait if the server starts to receive a request from the client and the transmission of that request is then halted for some reason. Generally, threads waiting to read are an indication of a slow network or client.
Databases in Use The total number of databases being serviced by the server.
The Connection Status table simply lists the current active connections, with related connection information.

Table 4. Connection Status

Table Header Description
Time Opened The time on the server when the connection was initially opened.
Started The number of operations initiated by this connection.
Completed The number of operations completed by the server for this connection.
Bound as The distinguished name used by the client to bind to the server. If the client has not authenticated to the server, the server displays not bound in this field.
Read/Write Indicates whether the server is currently blocked for read or write access to the client. There are two possible values:
Not blocked means that the server is idle, actively sending data to the client, or actively reading data from the client.
Blocked means that the server is trying to send data to the client or read data from the client but cannot. The probable cause is a slow network or client.
The Global Database Cache table lists the cache information for all databases within the Directory Server instance.

Note

Although the performance counter for the global database cache is listed with the other server performance counters in the Directory Server Console, the actual database cache entries are located and monitored in cn=monitor,cn=database_instance,cn=ldbm database,cn=plugins,cn=config, as are the other database activities.

Table 5. Global Database Cache Information

Table Header Description
Hits The number of times the server could process a request by obtaining data from the cache rather than by going to the disk.
Tries The total number of database accesses since server startup.
Hit Ratio The ratio of successful cache hits to cache tries. The closer this number is to 100%, the better.
Pages Read In The number of pages read from disk into the cache.
Pages Written Out The number of pages written from the cache back to disk.
Read-Only Page Evicts The number of read-only pages discarded from the cache to make room for new pages. Pages discarded from the cache have to be written to disk, possibly affecting server performance. The lower the number of page evicts, the better.
Read-Write Page Evicts The number of read-write pages discarded from the cache to make room for new pages. This value differs from Pages Written Out in that these are discarded read-write pages that have not been modified. Pages discarded from the cache have to be written to disk, possibly affecting server performance. The lower the number of page evicts, the better.

3.1.2. Monitoring the Directory Server from the Command Line

The Directory Server's current activities can be monitored using LDAP tools such as ldapsearch, with the following characteristics:
  • Search with the attribute filter objectClass=*.
  • Use the search base cn=monitor; the monitoring attributes for the server are found in the cn=monitor entry.
  • Use the search scope base.
For example:
ldapsearch -D "cn=directory manager" -W -p 389 -h server.example.com -x -s base -b "cn=monitor" "(objectclass=*)"
The monitoring attributes for the Directory Server are found in the cn=monitor entry.

Table 6. Server Monitoring Attributes

Attribute Description
version Identifies the directory's current version number.
threads The current number of active threads used for handling requests. Additional threads may be created by internal server tasks, such as replication or chaining.
connection:fd:opentime:opsinitiated:opscompleted:binddn:[rw] Provides the following summary information for each open connection (only available if you bind to the directory as Directory Manager):
fd — The file descriptor used for this connection.
opentime — The time this connection was opened.
opsinitiated — The number of operations initiated by this connection.
opscompleted — The number of operations completed.
binddn — The distinguished name used by this connection to connect to the directory.
rw — The field shown if the connection is blocked for read or write.
By default, this information is available to Directory Manager. However, the ACI associated with this information can be edited to allow others to access the information.
currentconnections Identifies the number of connections currently in service by the directory.
totalconnections Identifies the number of connections handled by the directory since it started.
dtablesize Shows the number of file descriptors available to the directory. Each connection requires one file descriptor; file descriptors are also needed for every open index, for log file management, and for ns-slapd itself. Essentially, this value shows how many additional concurrent connections can be serviced by the directory. For more information on file descriptors, see the operating system documentation.
readwaiters Identifies the number of threads waiting to read data from a client.
opsinitiated Identifies the number of operations the server has initiated since it started.
opscompleted Identifies the number of operations the server has completed since it started.
entriessent Identifies the number of entries sent to clients since the server started.
bytessent Identifies the number of bytes sent to clients since the server started.
currenttime Identifies the time when this snapshot of the server was taken. The time is displayed in Greenwich Mean Time (GMT) in UTC format.
starttime Identifies the time when the server started. The time is displayed in Greenwich Mean Time (GMT) in UTC format.
nbackends Identifies the number of back ends (databases) the server services.
backendmonitordn Identifies the DN of each directory database.
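The cumulative counters in Table 6 can be turned into per-minute averages like the ones the Console shows. The sketch below is one way to do that, not the Console's own implementation. It assumes the cn=monitor attributes have already been read into a dictionary of strings, that starttime and currenttime use the generalized time layout YYYYMMDDHHMMSSZ, and that a connection value contains exactly the fields listed in Table 6 with no colon inside the bind DN. The sample values are made up.

```python
from datetime import datetime

# Assumed timestamp layout (LDAP generalized time, e.g. 20240101120000Z).
TIME_FORMAT = "%Y%m%d%H%M%SZ"
COUNTERS = ("totalconnections", "opsinitiated", "opscompleted",
            "entriessent", "bytessent")


def per_minute_averages(monitor: dict) -> dict:
    """Average each cumulative counter over the minutes since starttime."""
    start = datetime.strptime(monitor["starttime"], TIME_FORMAT)
    now = datetime.strptime(monitor["currenttime"], TIME_FORMAT)
    minutes = (now - start).total_seconds() / 60
    if minutes <= 0:
        raise ValueError("currenttime must be later than starttime")
    return {name: int(monitor[name]) / minutes for name in COUNTERS}


def parse_connection(value: str) -> dict:
    """Split one connection value into the fields named in Table 6."""
    fd, opentime, started, completed, rest = value.split(":", 4)
    # The trailing rw field may be empty; assume the DN itself has no colon.
    binddn, _, rw = rest.partition(":")
    return {
        "fd": int(fd),
        "opentime": opentime,
        "opsinitiated": int(started),
        "opscompleted": int(completed),
        "binddn": binddn,
        "rw": rw,
    }


# Made-up sample values for a server that has been up for one hour.
sample = {
    "starttime": "20240101120000Z",
    "currenttime": "20240101130000Z",
    "totalconnections": "120",
    "opsinitiated": "600",
    "opscompleted": "594",
    "entriessent": "1800",
    "bytessent": "3600000",
}
print(per_minute_averages(sample))
print(parse_connection("64:20240101125500Z:3:2:cn=directory manager:"))
```

A growing gap between opsinitiated and opscompleted across successive snapshots suggests that requests are queuing up.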

3.2. Monitoring Database Activity

Note

Some of the database counters monitored by Directory Server use 64-bit integers, even on 32-bit systems (entry cache hits, entry cache tries, the current cache size, and the maximum cache size). On high-volume systems, this keeps the counters from rolling over too quickly and skewing monitoring data.

3.2.1. Monitoring Database Activity from the Directory Server Console

  1. In the Directory Server Console, select the Status tab.
  2. In the navigation tree, expand the Performance Counters folder, and select the database to monitor.
    The tab displays current information about database activity. If the server is currently not running, this tab will not provide performance monitoring information.
The Summary Information section shows the cumulative information for all of the databases being monitored and some cache-related configuration settings which are applied to all databases.

Table 7. Summary Information

Performance Metric Current Total
Read-Only Status Shows whether the database is currently in read-only mode. The database is in read-only mode when the nsslapd-readonly attribute is set to on.
Entry Cache Hits The total number of successful entry cache lookups. That is, the total number of times the server could process a search request by obtaining data from the cache rather than by going to disk.
Entry Cache Tries The total number of entry cache lookups since the directory was last started. That is, the total number of entries requested since server startup.
Entry Cache Hit Ratio
The ratio of successful entry cache lookups to entry cache tries. This number is based on the total lookups and hits since the directory was last started. The closer this value is to 100%, the better. Whenever an operation attempts to find an entry that is not present in the entry cache, the directory has to perform a disk access to obtain the entry. Thus, as this ratio drops towards zero, the number of disk accesses increases, and directory search performance drops.
To improve this ratio, increase the size of the entry cache by increasing the value of the nsslapd-cachememsize attribute in the cn=database_name, cn=ldbm database,cn=plugins,cn=config entry for the database. In the Directory Server Console, this is set in the Memory available for cache field in the database settings.
Current Entry Cache Size (in Bytes) The total size of directory entries currently present in the entry cache.
Maximum Entry Cache Size (in Bytes)
The size of the entry cache maintained by the directory.
This value is managed by the nsslapd-cachememsize attribute in the cn=database_name, cn=ldbm database,cn=plugins,cn=config entry for the database. This is set in the Memory available for cache field in the database settings in the Directory Server Console.
Current Entry Cache Size (in Entries) The number of directory entries currently present in the entry cache.
Maximum Entry Cache Size (in Entries)
DEPRECATED.
The maximum number of directory entries that can be maintained in the entry cache.
Do not attempt to manage the cache size by setting a maximum number of allowed entries. This can make it difficult for the host to allocate RAM effectively. Manage the cache size by setting the amount of RAM available to the cache, using the nsslapd-cachememsize attribute.
There are many different databases listed for the database monitoring page, by default, because databases are maintained for both entries and indexed attributes. All databases, though, have the same kind of cache information monitored in the counters.

Table 8. Database Cache Information

Performance Metric Current Total
Hits The number of times the database cache successfully supplied a requested page.
Tries The number of times the database cache was asked for a page.
Hit Ratio
The ratio of database cache hits to database cache tries. The closer this value is to 100%, the better. Whenever a directory operation attempts to find a portion of the database that is not present in the database cache, the directory has to perform a disk access to obtain the appropriate database page. Thus, as this ratio drops towards zero, the number of disk accesses increases, and directory performance drops.
To improve this ratio, increase the amount of data that the directory maintains in the database cache by increasing the value of the nsslapd-dbcachesize attribute. This is the Maximum Cache Size database setting in the Directory Server Console.
Pages Read In The number of pages read from disk into the database cache.
Pages Written Out The number of pages written from the cache back to disk. A database page is written to disk whenever a read-write page has been modified and then subsequently deleted from the cache. Pages are deleted from the database cache when the cache is full and a directory operation requires a database page that is not currently stored in cache.
Read-Only Page Evicts The number of read-only pages discarded from the cache to make room for new pages.
Read-Write Page Evicts The number of read-write pages discarded from the cache to make room for new pages. This value differs from Pages Written Out in that these are discarded read-write pages that have not been modified.

Table 9. Database File-Specific

Performance Metric Current Total
Cache Hits The number of times that a search result resulted in a cache hit on this specific file. That is, a client performs a search that requires data from this file, and the directory obtains the required data from the cache.
Cache Misses The number of times that a search result failed to hit the cache on this specific file. That is, a search that required data from this file was performed, and the required data could not be found in the cache.
Pages Read In The number of pages brought to the cache from this file.
Pages Written Out The number of pages for this file written from cache to disk.

3.2.2. Monitoring Database Activity from the Command Line

A database's current activities can be monitored using LDAP tools such as ldapsearch. The search targets the monitoring subtree of the LDBM database entry, cn=monitor,cn=database_name,cn=ldbm database,cn=plugins,cn=config. This contains all of the monitoring attributes for that specific database instance.
For example:
ldapsearch -D "cn=directory manager" -W -p 389 -h server.example.com -x -s base -b "cn=monitor,cn=database_name,cn=ldbm database,cn=plugins,cn=config" "(objectclass=*)"

Table 10. Database Monitoring Attributes

Attribute Description
database Identifies the type of database currently being monitored.
readonly Indicates whether the database is in read-only mode; 0 means that the server is not in read-only mode, 1 means that it is in read-only mode.
entrycachehits The total number of successful entry cache lookups. That is, the total number of times the server could process a search request by obtaining data from the cache rather than by going to disk.
entrycachetries The total number of entry cache lookups since the directory was last started. That is, the total number of search operations performed against the server since server startup.
entrycachehitratio
The ratio of successful entry cache lookups to entry cache tries. This number is based on the total lookups and hits since the directory was last started. The closer this value is to 100%, the better. Whenever a search operation attempts to find an entry that is not present in the entry cache, the directory has to perform a disk access to obtain the entry. Thus, as this ratio drops towards zero, the number of disk accesses increases, and directory search performance drops.
To improve this ratio, increase the size of the entry cache by increasing the value of the nsslapd-cachememsize attribute in the cn=database_name, cn=ldbm database,cn=plugins,cn=config entry for the database. In the Directory Server Console, this is set in the Memory available for cache field in the database settings.
currententrycachesize
The total size, in bytes, of directory entries currently present in the entry cache.
To increase the size of the entries which can be present in the cache, increase the value of the nsslapd-cachememsize attribute in the cn=database_name, cn=ldbm database,cn=plugins,cn=config entry for the database. In the Directory Server Console, this is set in the Memory available for cache field in the database settings.
maxentrycachesize
The maximum size, in bytes, of directory entries that can be maintained in the entry cache.
To increase the size of the entries which can be present in the cache, increase the value of the nsslapd-cachememsize attribute in the cn=database_name, cn=ldbm database,cn=plugins,cn=config entry for the database. In the Directory Server Console, this is set in the Memory available for cache field in the database settings.
dbcachehits The number of times the server could process a request by obtaining data from the cache rather than by going to the disk.
dbcachetries The total number of database accesses since server startup.
dbcachehitratio The ratio of successful cache hits to cache tries. The closer this number is to 100%, the better.
dbcachepagein The number of pages read from disk into the cache.
dbcachepageout The number of pages written from the cache back to disk.
dbcacheroevict The number of read-only pages discarded from the cache to make room for new pages. Pages discarded from the cache have to be written to disk, possibly affecting server performance. The lower the number of page evicts, the better.
dbcacherwevict The number of read-write pages discarded from the cache to make room for new pages. This value differs from Pages Written Out in that these are discarded read-write pages that have not been modified. Pages discarded from the cache have to be written to disk, possibly affecting server performance. The lower the number of page evicts, the better.
dbfilename-number The name of the file. number provides a sequential integer identifier (starting at 0) for the file. All associated statistics for the file are given this same numerical identifier.
dbfilecachehit-number The number of times that a search result resulted in a cache hit on this specific file. That is, a client performs a search that requires data from this file, and the directory obtains the required data from the cache.
dbfilecachemiss-number The number of times that a search result failed to hit the cache on this specific file. That is, a search that required data from this file was performed, and the required data could not be found in the cache.
dbfilepagein-number The number of pages brought to the cache from this file.
dbfilepageout-number The number of pages for this file written from cache to disk.
currentdncachesize
The total size, in bytes, of DNs currently present in the DN cache.
To increase the size of the entries which can be present in the DN cache, increase the value of the nsslapd-dncachememsize attribute in the cn=database_name, cn=ldbm database,cn=plugins,cn=config entry for the database.
maxdncachesize
The maximum size, in bytes, of DNs that can be maintained in the DN cache.
To increase the size of the entries which can be present in the cache, increase the value of the nsslapd-dncachememsize attribute in the cn=database_name, cn=ldbm database,cn=plugins,cn=config entry for the database.
currentdncachecount The number of DNs currently present in the DN cache.
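The hit ratios can also be recomputed from the raw hit and try counters, and the numbered dbfile attributes can be grouped per file. The sketch below shows one way to do both from the text that ldapsearch prints. It assumes one attribute per line in "name: value" form with no LDIF line folding, and the sample entry, including the userRoot database name and all counter values, is made up for illustration.

```python
import re
from collections import defaultdict


def parse_ldif_entry(text: str) -> dict:
    """Collect 'attribute: value' lines from one LDIF entry into a dict."""
    attrs = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep and not line.startswith("#"):
            attrs[name.strip().lower()] = value.strip()
    return attrs


def hit_ratio(hits: int, tries: int) -> float:
    """Hits as a percentage of tries; 0.0 when nothing has been tried yet."""
    return 100.0 * hits / tries if tries else 0.0


def per_file_stats(attrs: dict) -> dict:
    """Group dbfilename-N, dbfilecachehit-N, ... by their shared number N."""
    files = defaultdict(dict)
    for name, value in attrs.items():
        match = re.fullmatch(r"(dbfile[a-z]+)-(\d+)", name)
        if match:
            files[int(match.group(2))][match.group(1)] = value
    return dict(files)


# Made-up sample output for a database named userRoot.
sample = """\
dn: cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config
entrycachehits: 900
entrycachetries: 1000
dbcachehits: 4950
dbcachetries: 5000
dbfilename-0: userRoot/id2entry.db4
dbfilecachehit-0: 40
dbfilecachemiss-0: 10
"""

attrs = parse_ldif_entry(sample)
print("entry cache hit ratio:",
      hit_ratio(int(attrs["entrycachehits"]), int(attrs["entrycachetries"])))
print("database cache hit ratio:",
      hit_ratio(int(attrs["dbcachehits"]), int(attrs["dbcachetries"])))
print(per_file_stats(attrs))
```

Because the counters are cumulative since startup, comparing two snapshots taken some minutes apart gives a better picture of current cache behavior than a single reading.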

3.4. Monitoring the Local Disk for Graceful Shutdown

When the disk space available on a system becomes too small, the Directory Server process (slapd) crashes. Any abrupt shutdown runs the risk of corrupting the database or losing directory data.
It is possible to monitor the disk space available to the slapd process. A disk monitoring thread is enabled using the nsslapd-disk-monitoring configuration attribute. This creates a monitoring thread that wakes every ten (10) seconds to check for available disk space in certain areas.
If the disk space approaches a defined threshold, then the slapd begins a series of steps (by default) to reduce the amount of disk space it is consuming:
  • Verbose logging is disabled.
  • Access logging and audit logging are disabled.
  • Rotated (archived) logs are deleted.

Note

Error log messages are always recorded, even when other changes are made to the logging configuration.
If the available disk space continues to drop to half of the configured threshold, then the slapd begins a graceful shut down process (within a grace period); and if the available disk space ever drops to 4KB, then the slapd process shuts down immediately. If the disk space is freed up, then the shutdown process is aborted, and all of the previously disabled log settings are re-enabled.
By default, the monitoring thread checks the configuration, transaction log, and database directories. An additional attribute (nsslapd-disk-monitoring-logging-critical) can be set to include the logs directory when evaluating disk space.
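The escalation described above can be summarized as a small decision function. This is only a sketch of the documented behavior, not the server's code. It assumes that "approaches the threshold" means at or below the threshold, that 4KB means 4096 bytes, and it uses the documented default threshold of 2000000 bytes.

```python
from enum import Enum

# "4KB" in the text; treating it as 4096 bytes is an assumption.
IMMEDIATE_SHUTDOWN_BYTES = 4 * 1024


class Action(Enum):
    OK = "no action"
    REDUCE_LOGGING = "reduce logging and delete rotated logs"
    GRACEFUL_SHUTDOWN = "start the grace period, then shut down"
    IMMEDIATE_SHUTDOWN = "shut down immediately"


def disk_action(free_bytes: int, threshold_bytes: int = 2_000_000) -> Action:
    """Map the free disk space to the step the monitoring thread would take."""
    if free_bytes <= IMMEDIATE_SHUTDOWN_BYTES:
        return Action.IMMEDIATE_SHUTDOWN
    if free_bytes <= threshold_bytes / 2:
        return Action.GRACEFUL_SHUTDOWN
    if free_bytes <= threshold_bytes:
        return Action.REDUCE_LOGGING
    return Action.OK


for free in (5_000_000, 1_500_000, 900_000, 4_096):
    print(free, "->", disk_action(free).value)
```

The checks run from most to least severe so that a very low reading never falls through to a milder step.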
Disk monitoring is disabled by default, but it can be enabled and configured by adding the appropriate configuration attributes to the cn=config entry. Table 12, “Disk Monitoring Configuration Attributes” lists all of the configuration options.
  1. Using ldapmodify, add the disk monitoring attributes. At a minimum, turn on the nsslapd-disk-monitoring attribute to enable disk monitoring. The default threshold is 2MB; this can be configured (optionally) in the nsslapd-disk-monitoring-threshold attribute.
    For example:
    [jsmith@server ~]$ ldapmodify -D "cn=directory manager" -W -p 389 -x
    dn: cn=config
    changetype: modify
    add: nsslapd-disk-monitoring
    nsslapd-disk-monitoring: on
    -
    add: nsslapd-disk-monitoring-threshold
    nsslapd-disk-monitoring-threshold: 3000000
    -
    add: nsslapd-disk-monitoring-grace-period
    nsslapd-disk-monitoring-grace-period: 20
  2. Restart the Directory Server to load the new configuration.
    [root@server ~]# service dirsrv restart

Table 12. Disk Monitoring Configuration Attributes

Configuration Attribute Description
nsslapd-disk-monitoring Enables disk monitoring. This is the only required attribute, since the other configuration options have usable defaults.
nsslapd-disk-monitoring-grace-period Sets a grace period to wait before shutting down the server after it hits half of the disk space limit. This gives an administrator time to address the situation. The default value is 60 (minutes).
nsslapd-disk-monitoring-logging-critical Sets whether to shut down the server if the log directories pass the halfway point set in the disk space limit. This prevents the monitoring thread from disabling audit or access logging or from deleting rotated log files.
nsslapd-disk-monitoring-threshold Sets the amount of disk space, in bytes, to use to evaluate whether the server has enough available disk space. Once the space reaches half of this threshold, then the server begins a shut down process. The default value is 2000000 (2MB).

3.5. Viewing Log Files

Note

The access and error logs are enabled by default and can be viewed immediately. Before the audit log can be viewed, audit logging must be enabled for the directory, or the audit log will not be kept.
  1. In the Directory Server Console, select the Status tab.
  2. In the navigation tree, expand the Log folder. There are three folders available, for the access, error, and audit logs.
  3. When you select the log type to view, a table displays a list of the last 25 entries in the selected log.
  4. Optionally, change the settings of the log display and click Refresh to update the display.
    • The Select Log pull-down menu allows you to select an archived (rotated) log rather than the currently active log.
    • The Lines to show text box changes the number of log entries to display in the window.
    • The Show only lines containing text box sets a filter, including regular expressions, to use to display only certain matching log entries.
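The same kind of view can be approximated outside the Console. The sketch below keeps the last matching lines of a log, assuming the filter is applied first and the line limit second (the Console's exact order is not stated here). The function name and the sample log lines are made up for illustration.

```python
import re
from collections import deque


def last_matching_lines(lines, pattern=None, count=25):
    """Return the last `count` lines, keeping only those that match `pattern`."""
    regex = re.compile(pattern) if pattern else None
    kept = deque(maxlen=count)  # older matches fall off the front
    for line in lines:
        if regex is None or regex.search(line):
            kept.append(line.rstrip("\n"))
    return list(kept)


# Made-up access log lines; a real log would be read from the instance's
# log directory, for example /var/log/dirsrv/slapd-instance_name/access.
sample = [
    "[01/Jan/2024:12:00:01 +0000] conn=11 op=0 RESULT err=49 tag=97 nentries=0 etime=0\n",
    "[01/Jan/2024:12:00:02 +0000] conn=12 op=1 RESULT err=0 tag=101 nentries=1 etime=0\n",
    "[01/Jan/2024:12:00:03 +0000] conn=13 op=0 RESULT err=49 tag=97 nentries=0 etime=0\n",
]
for line in last_matching_lines(sample, r"err=49\b", count=25):
    print(line)
```

Passing an open file object instead of the sample list reads the log line by line without loading it all into memory.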

Note

Selecting the Continuous check box refreshes the log display automatically every ten seconds. Continuous log refresh does not work well with log files over 10 megabytes.

3.6. Replacing Log Files with a Named Pipe

Many administrators want to do some special configuration or operation with logging data, like configuring an access log to record only certain events. This is not possible using the standard Directory Server log file configuration attributes, but it is possible by sending the log data to a named pipe, and then using another script or plug-in to process the data. Using a named pipe for the log simplifies these special tasks, like:
  • Logging certain events, like failed bind attempts or connections from specific users or IP addresses
  • Logging entries which match a specific regular expression pattern
  • Keeping the log to a certain length (logging only the last number of lines)
  • Sending a notification, such as an email, when an event occurs
The basic format of the script is:

ds-logpipe.py named_pipe [ --user pipe_user ] [ --maxlines number ] [[ --serverpidfile file.pid ] | [ --serverpid PID ]] [ --servertimeout seconds ] [ --plugin=/path/to/plugin.py | [ pluginfile.arg=value ]]
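The --plugin option in the synopsis points to a Python file. As a hedged sketch of what such a file might look like, the following example records failed bind attempts. It assumes the script calls a function named plugin for every line read from the pipe, with optional pre and post functions called before and after, and that returning True means processing continues. Those hook names and return values are assumptions to verify against the script's own documentation. The err=49 value is the standard LDAP result code for invalid credentials.

```python
import re

# Matches RESULT lines for failed binds (LDAP result code 49).
FAILED_BIND = re.compile(r"conn=(\d+) op=(\d+) RESULT err=49\b")

failed = []  # (conn, op) pairs seen so far


def pre(plgargs):
    """Assumed hook: called once with the pluginfile.arg=value pairs as a dict."""
    failed.clear()
    return True


def plugin(line):
    """Assumed hook: called with each line read from the pipe."""
    match = FAILED_BIND.search(line)
    if match:
        failed.append((int(match.group(1)), int(match.group(2))))
    return True  # assumed to mean "keep processing"


def post():
    """Assumed hook: called once when the script exits."""
    print("failed binds: %d" % len(failed))
```

Keeping the matching logic in a compiled regular expression makes it easy to swap in a different event, such as connections from a specific address.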

More detailed usage information is in the Configuration, Command, and File Reference.
However, while that has the advantage of being simple to implement and not requiring any Directory Server configuration changes, simply running the script has a big disadvantage: all of the log viewers in the Directory Server Console and any script or tool (such as logconv.pl) that expect to access a real file will fail.
If the Directory Server instance will permanently use the named pipe rather than a real file for logging, then it is possible to reconfigure the instance to create the named pipe and use it for logging (as it does by default for the log files). When the Directory Server instance is configured to use the named pipe then all of the log analysis tools — the Directory Server Console and any Directory Server scripts — work fine.
Three things need to be configured for the log configuration for the instance:
  • The log file to use has to be changed to the pipe (nsslapd-*log)
  • Buffering should be disabled because the script already buffers the log entries (nsslapd-*log-logbuffering)
  • Log rotation should be disabled so that the server does not attempt to rotate the named pipe (nsslapd-*log-maxlogsperdir, nsslapd-*log-logexpirationtime, and nsslapd-*log-logrotationtime)
These configuration changes can be made in the Directory Server Console or using ldapmodify.
For example, this switches the access log to access.pipe:
ldapmodify -D "cn=directory manager" -W -p 389 -h server.example.com -x

dn: cn=config
changetype: modify
replace: nsslapd-accesslog
nsslapd-accesslog: /var/log/dirsrv/slapd-instance_name/access.pipe
-
replace: nsslapd-accesslog-logbuffering
nsslapd-accesslog-logbuffering: off
-
replace: nsslapd-accesslog-maxlogsperdir
nsslapd-accesslog-maxlogsperdir: 1
-
replace: nsslapd-accesslog-logexpirationtime
nsslapd-accesslog-logexpirationtime: -1
-
replace: nsslapd-accesslog-logrotationtime
nsslapd-accesslog-logrotationtime: -1
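Because the attribute names follow the nsslapd-*log pattern, the same set of changes can be generated for any of the three logs. The helper below is a hypothetical sketch that only builds the LDIF text to feed to ldapmodify. It assumes every log type supports all five attributes listed above, which should be checked for the error and audit logs on the version in use.

```python
LOG_TYPES = ("access", "error", "audit")


def pipe_ldif(log_type: str, pipe_path: str) -> str:
    """Build the LDIF that points one log at a named pipe and disables
    buffering and rotation for it."""
    if log_type not in LOG_TYPES:
        raise ValueError("log_type must be one of %s" % ", ".join(LOG_TYPES))
    prefix = f"nsslapd-{log_type}log"
    changes = [
        (prefix, pipe_path),
        (f"{prefix}-logbuffering", "off"),
        (f"{prefix}-maxlogsperdir", "1"),
        (f"{prefix}-logexpirationtime", "-1"),
        (f"{prefix}-logrotationtime", "-1"),
    ]
    # Each change names the attribute twice: once in the replace line and
    # once in the value line. A line holding only "-" separates changes.
    blocks = [f"replace: {attr}\n{attr}: {value}" for attr, value in changes]
    return "dn: cn=config\nchangetype: modify\n" + "\n-\n".join(blocks) + "\n"


print(pipe_ldif("access", "/var/log/dirsrv/slapd-instance_name/access.pipe"))
```

Generating the attribute name once per change avoids a mismatch between the replace line and the value line.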

Note

Making these changes using the -f option will cause the server to close the current log file and switch to the named pipe immediately. This can be very helpful for debugging a running server and sifting the log output for specific messages.

3.7. Improving Logging Performance

Larger server deployments can generate several dozen megabytes of logs per hour. Depending on the resources available on the server host machine, reconfiguring or disabling access logging can improve system and Directory Server performance.
Before disabling access logging, first configure access log buffering. Buffering holds log entries in memory and writes them to disk in batches rather than one at a time, so that the Directory Server performance does not degrade even under a heavy load.
The access log is buffered by default, but make sure the log is using buffering for best performance.
ldapmodify -D "cn=directory manager" -W -p 389 -x

dn: cn=config
changetype: modify
replace: nsslapd-accesslog-logbuffering
nsslapd-accesslog-logbuffering: on
If that does not improve performance, then disable access logging entirely.
ldapmodify -D "cn=directory manager" -W -p 389 -x

dn: cn=config
changetype: modify
replace: nsslapd-accesslog-enabled
nsslapd-accesslog-enabled: off

Warning

Access logging is extremely helpful for debugging issues in the server and monitoring client connections and failed connection attempts. Do not disable access logging in the normal operating environment.
For alternatives, see Section 3.6, “Replacing Log Files with a Named Pipe”, since using named pipe log scripts can improve performance while still logging information on high performance production servers.