Chapter 10. Preventing monopolization of a replica in a multi-supplier replication topology

In a multi-supplier replication topology, a supplier under heavy update load can monopolize a replica so that other suppliers cannot update it.

This section describes when monopolization happens, how to identify the problem, and how to configure suppliers to avoid it.

10.1. When monopolization happens

One of the features of multi-supplier replication is that a supplier acquires exclusive access to a replica while it sends updates, which locks out all other suppliers. If another supplier attempts to acquire access while it is locked out, the replica sends back a busy response, and that supplier waits for the time set in the nsds5ReplicaBusyWaitTime parameter before it starts another attempt. In the meantime, the supplier sends its updates to another replica. When the first replica is free again, the supplier sends the updates to that host.

This becomes a problem if the supplier that holds the lock is under a heavy update load or has a lot of pending updates in the changelog. In this situation, the locking supplier finishes sending updates and immediately attempts to reacquire the same replica. Such an attempt succeeds in most cases, because the other suppliers might still be waiting. As a result, a single supplier can monopolize a replica for several hours or longer. To give other suppliers a chance to acquire the replica, you can set a pause between two update sessions in the nsds5ReplicaSessionPauseTime parameter.
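
The timing problem can be illustrated with a small toy model. The following sketch is not how Directory Server schedules sessions. It assumes a hypothetical supplier A that always holds the replica for a fixed session length and then pauses, and a hypothetical supplier B that was refused at second 0 and retries at a fixed interval. The function name and all parameters are illustrative only.

```shell
# Toy model (assumption: fixed session length, whole seconds, values > 0).
# Supplier A holds the replica for session_len seconds, then pauses for
# pause seconds before reacquiring it. Supplier B was refused at t=0 and
# retries every busy_wait seconds. Prints the first second at which a retry
# of B finds the replica free, or "never" if none does within horizon seconds.
first_free_retry() {
    session_len=$1
    pause=$2
    busy_wait=$3
    horizon=$4
    cycle=$((session_len + pause))
    t=$busy_wait
    while [ "$t" -le "$horizon" ]; do
        # Position of this retry inside A's hold-then-pause cycle.
        offset=$((t % cycle))
        if [ "$offset" -ge "$session_len" ]; then
            echo "$t"
            return 0
        fi
        t=$((t + busy_wait))
    done
    echo "never"
}

first_free_retry 10 0 3 3600   # no pause: prints "never"
first_free_retry 10 4 3 3600   # pause longer than the retry interval: prints "12"
```

In this model, a pause that is longer than the retry interval always contains at least one retry of the waiting supplier, while a pause of zero never does.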

10.2. Enabling replication logging to identify monopolization of replicas

If one or more suppliers are often under a heavy update load, and replicas frequently do not receive updates, enable logging of replication messages to identify monopolization situations.

Prerequisites

  • There are multiple suppliers in the replication topology.

Procedure

  1. Enable replication logging:

    # dsconf -D "cn=Directory Manager" ldap://server.example.com config replace nsslapd-errorlog-level=8192

    Note that this command enables only replication logging and disables logging of other error messages.

  2. Monitor the /var/log/dirsrv/slapd-instance_name/errors log file and search for the following error message:

    Replica Busy! Status: [Error (1) Replication error acquiring replica: replica busy]

    Note that it is normal if Directory Server occasionally logs this error. However, if replicas frequently do not receive updates, and the suppliers log this error, consider updating your configuration to solve this problem.
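
To judge whether the error is occasional or frequent, it can help to count its occurrences. The following is a minimal sketch that uses only standard grep. The function name count_replica_busy is hypothetical, and the search string is taken from the message shown above.

```shell
# Count "replica busy" acquisition failures in a Directory Server error log.
# $1: path to the error log file
count_replica_busy() {
    # -F matches the text literally, -c prints the number of matching lines.
    grep -cF 'Replication error acquiring replica: replica busy' "$1"
}

# Usage (replace instance_name with the name of your instance):
# count_replica_busy /var/log/dirsrv/slapd-instance_name/errors
```

Running the function at intervals and comparing the counts gives a rough rate. Which rate counts as frequent depends on your environment.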

10.3. Configuring suppliers to avoid monopolization of replicas

This procedure describes how to set parameters on a supplier to prevent monopolization of replicas.

Because environments and load differ, set only the parameters that are relevant to your situation, and adjust the values accordingly.

Prerequisites

  • There are multiple suppliers in the replication topology.
  • Directory Server frequently logs Replica Busy! Status: [Error (1) Replication error acquiring replica: replica busy] errors.

Procedure

  1. Set the nsds5ReplicaBusyWaitTime parameter to configure the time a supplier waits before starting another attempt to acquire access to a replica after the replica sends a busy response:

    # dsconf -D "cn=Directory Manager" ldap://supplier.example.com repl-agmt set --suffix "dc=example,dc=com" --busy-wait-time 5 replication_agreement_name

    This command sets the time to wait to 5 seconds. This setting applies only to the specified replication agreement.

  2. Set the nsds5ReplicaSessionPauseTime parameter to configure the time a supplier waits between two update sessions:

    # dsconf -D "cn=Directory Manager" ldap://supplier.example.com repl-agmt set --suffix "dc=example,dc=com" --session-pause-time 15 replication_agreement_name

    This command sets the pause to 15 seconds. By default, nsds5ReplicaSessionPauseTime is one second longer than the value in nsds5ReplicaBusyWaitTime. This setting applies only to the specified replication agreement.

  3. Set the nsds5ReplicaReleaseTimeout parameter to terminate replication sessions after a given amount of time, regardless of whether the supplier has finished sending updates:

    # dsconf -D "cn=Directory Manager" ldap://supplier.example.com replication set --suffix "dc=example,dc=com" --repl-release-timeout 90

    This command sets the timeout to 90 seconds. This setting applies to all replication agreements for the specified suffix.

  4. Optional: Set a timeout period for a supplier so that it does not stay connected to a consumer indefinitely while attempting to send updates over a slow or broken connection:

    # dsconf -D "cn=Directory Manager" ldap://supplier.example.com repl-agmt set --conn-timeout 600 --suffix "dc=example,dc=com" replication_agreement_name

    This command sets the timeout to 600 seconds (10 minutes). To identify the optimum value, check the access log for the average amount of time the replication process takes, and set the timeout period accordingly.
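
After changing the settings, you might want to review the stored values. The following sketch filters the attributes from this procedure out of dsconf output. It assumes that the get subcommands print one "attribute: value" pair per line and that the --conn-timeout option is stored in the nsds5ReplicaTimeout attribute. The function name show_monopolization_settings is hypothetical.

```shell
# Keep only the attributes discussed in this procedure from text read on
# standard input. Matching is case-insensitive, because the capitalization
# of attribute names in the output is an assumption.
show_monopolization_settings() {
    grep -iE '^(nsds5ReplicaBusyWaitTime|nsds5ReplicaSessionPauseTime|nsds5ReplicaReleaseTimeout|nsds5ReplicaTimeout):'
}

# Usage against a live instance (not run here):
# dsconf -D "cn=Directory Manager" ldap://supplier.example.com repl-agmt get --suffix "dc=example,dc=com" replication_agreement_name | show_monopolization_settings
# dsconf -D "cn=Directory Manager" ldap://supplier.example.com replication get --suffix "dc=example,dc=com" | show_monopolization_settings
```

The first command covers the per-agreement settings, and the second covers the release timeout, which applies to the whole suffix.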