Chapter 19. Solving common replication problems

Multi-supplier replication uses an eventual consistency replication model. This means that the same entries can be changed on different servers. When replication occurs between these servers, Directory Server must resolve the conflicting changes. In most cases, resolution occurs automatically, based on the timestamp associated with the change on each server: the most recent change has priority. However, some conflicts require manual intervention to resolve.

19.1. Identifying and solving naming conflicts

When several supplier servers receive a request to create an entry with the same distinguished name (DN), each server creates the entry with this DN and a different entry unique identifier (entry ID). The entry ID is stored in the nsuniqueid operational attribute.

For example, Server A and Server B each receive a request to create the uid=user_name,ou=people,dc=example,dc=com user entry. As a result, each server has its own entry:

  • On Server A, the entry has:

    • uid=user_name,ou=people,dc=example,dc=com
    • nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b
  • On Server B, the entry has:

    • uid=user_name,ou=people,dc=example,dc=com
    • nsuniqueid=643a461e-b61311e1-b23be826-4afeed5f

During replication, Server A replicates its newly created uid=user_name,ou=people,dc=example,dc=com entry to Server B, and Server B replicates its newly created entry to Server A, so a naming conflict occurs on each server. By comparing change sequence numbers (CSNs), each server determines which entry was created earlier. In this example, assume that the entry on Server B was created earlier.
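The CSN comparison can be illustrated with a minimal Python sketch. It assumes the commonly documented CSN layout of 20 hexadecimal digits (8 for the timestamp, 4 for the sequence number, 4 for the replica ID, 4 for the sub-sequence number) and assumes that fields are compared in that order. The two CSN values are hypothetical, and the helper names are illustrative, not part of Directory Server.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: int   # seconds since the epoch
    seq: int         # sequence number within that second
    replica_id: int  # ID of the replica that originated the change
    subseq: int      # sub-sequence number


def parse_csn(text: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four fields (assumed layout)."""
    if len(text) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(text)}")
    return CSN(
        timestamp=int(text[0:8], 16),
        seq=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseq=int(text[16:20], 16),
    )


def csn_time(csn: CSN) -> str:
    """Render the timestamp field as a UTC date and time."""
    moment = datetime.fromtimestamp(csn.timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Hypothetical CSNs of the two ADD operations.
csn_a = parse_csn("61a4d903000100010000")  # entry created on Server A
csn_b = parse_csn("61a4d8fb000100020000")  # entry created on Server B

# Tuples compare field by field, so min() returns the earlier change.
earlier = min(csn_a, csn_b)
print("earlier change came from replica", earlier.replica_id, "at", csn_time(earlier))
```

With these values, the entry from the replica with ID 2 has the smaller CSN, so it would be treated as the valid entry and the other one would become the conflict entry.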

The automatic conflict resolution procedure changes the entry that was created last (the entry on Server A) in the following way:

  • Adds the nsuniqueid value to the non-unique DN.
  • Adds the nsds5replconflict attribute, which describes the operation that caused the conflict.
  • Adds the ldapsubentry object class.

Now the following entries exist on both servers:

  • The valid entry with:

    • uid=user_name,ou=people,dc=example,dc=com
    • nsuniqueid=643a461e-b61311e1-b23be826-4afeed5f
  • The conflict entry with:

    • nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
    • nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b
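The DN of the conflict entry follows directly from the original DN and the nsuniqueid value. The following Python sketch shows the mapping in both directions. The function names are illustrative, and the string handling is deliberately naive: it assumes that the nsuniqueid value contains no "+" character, which holds for the values shown above.

```python
def conflict_dn(original_dn: str, nsuniqueid: str) -> str:
    """Build the conflict DN by prefixing the first RDN with nsuniqueid=<id>+."""
    return f"nsuniqueid={nsuniqueid}+{original_dn}"


def original_dn(conflict: str) -> str:
    """Recover the original DN from a conflict DN."""
    prefix, sep, rest = conflict.partition("+")
    if not sep or not prefix.lower().startswith("nsuniqueid="):
        raise ValueError("not a conflict DN")
    return rest


dn = "uid=user_name,ou=people,dc=example,dc=com"
renamed = conflict_dn(dn, "a7f1758b-512211ec-b115e2e9-7dc2d46b")
print(renamed)
print(original_dn(renamed))
```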

To solve the naming conflict manually, use the following procedure on each server.

Procedure

  1. List the conflict entries:

    # dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict list dc=example,dc=com
    dn: nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
    cn: user_name
    displayName: user
    gidNumber: 99998
    homeDirectory: /var/empty
    legalName: user name
    loginShell: /bin/false
    nsds5replconflict: namingConflict (ADD) uid=user_name,ou=people,dc=example,dc=com
    objectClass: top
    objectClass: nsPerson
    objectClass: nsAccount
    objectClass: nsOrgPerson
    objectClass: posixAccount
    objectClass: ldapsubentry
    uid: user_name
    uidNumber: 99998
  2. If conflict entries exist, decide how to proceed:

    • To keep only the valid entry (uid=user_name,ou=people,dc=example,dc=com) and delete the conflict entry, enter:

      # dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict delete nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
    • To keep only the conflict entry (nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com) and delete the valid entry, enter:

      # dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict swap nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
    • To keep both entries, specify a new relative distinguished name (RDN) to rename the conflict entry:

      # dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict convert --new-rdn=uid=user_name_NEW nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com

      This command renames the conflict entry to uid=user_name_NEW,ou=people,dc=example,dc=com.

Warning

Directory Server replicates LDAP operations performed on a conflict entry. Usually, replicated operations target the entry by using the nsuniqueid of the original operation entry rather than the operation DN. However, for conflict entries, the behavior might differ.
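When the list command in step 1 returns many conflict entries, a short script can summarize them before you decide how to proceed. The following Python sketch parses output in the format shown in step 1. It assumes that entries are separated by blank lines and that no attribute value is folded across lines. The function names are illustrative, and the sample text is an abbreviated copy of the output above.

```python
from typing import Dict, List, Tuple

Entry = Dict[str, List[str]]

SAMPLE = """\
dn: nsuniqueid=a7f1758b-512211ec-b115e2e9-7dc2d46b+uid=user_name,ou=people,dc=example,dc=com
cn: user_name
nsds5replconflict: namingConflict (ADD) uid=user_name,ou=people,dc=example,dc=com
objectClass: top
objectClass: ldapsubentry
uid: user_name
"""


def parse_entries(text: str) -> List[Entry]:
    """Group 'attribute: value' lines into entries, keyed by lowercase name."""
    entries: List[Entry] = []
    current: Entry = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        key, sep, value = line.partition(": ")
        if sep:
            current.setdefault(key.lower(), []).append(value)
    if current:
        entries.append(current)
    return entries


def describe_conflict(entry: Entry) -> Tuple[str, str, str, str]:
    """Return (conflict DN, conflict type, operation, original DN)."""
    kind, operation, target = entry["nsds5replconflict"][0].split(" ", 2)
    return entry["dn"][0], kind, operation.strip("()"), target


for item in parse_entries(SAMPLE):
    dn, kind, operation, target = describe_conflict(item)
    print(f"{kind} caused by {operation} on {target}")
    print(f"  conflict entry: {dn}")
```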

19.2. Identifying and solving orphan entry conflicts

When Directory Server replicates a delete operation and the consumer server finds that the entry to be deleted has child entries, the conflict resolution procedure creates a glue entry to avoid having orphaned entries in the directory.

In the same way, when Directory Server replicates an add operation and the consumer server cannot find the parent entry, the conflict resolution procedure creates a glue entry for the parent.

Glue entries are temporary entries that include the object classes glue and extensibleObject. Glue entries can be created in several ways:

  • If the conflict resolution procedure finds a deleted entry with a matching unique identifier, the glue entry has the same attributes as the deleted entry, but with the added glue object class and the nsds5ReplConflict attribute.

    In such cases, either remove the glue object class and the nsds5ReplConflict attribute from the glue entry to keep it as a normal entry, or delete the glue entry and its child entries.

  • Otherwise, the server creates a minimal entry with the glue and extensibleObject object classes.
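As a rough illustration of how a script could recognize glue entries, the following Python sketch checks an entry for the glue object class and for the nsds5ReplConflict attribute. The entry is modeled as a plain dictionary of attribute names to value lists, which is an assumption made for this example and not a Directory Server data structure. Names are compared without regard to case, because LDAP attribute names and object class values are case-insensitive.

```python
from typing import Dict, List

Entry = Dict[str, List[str]]


def attr_values(entry: Entry, name: str) -> List[str]:
    """Look up an attribute by name, ignoring case."""
    wanted = name.lower()
    for key, values in entry.items():
        if key.lower() == wanted:
            return values
    return []


def is_glue(entry: Entry) -> bool:
    """True if the entry carries the glue object class."""
    return "glue" in {value.lower() for value in attr_values(entry, "objectClass")}


def has_conflict_marker(entry: Entry) -> bool:
    """True if the entry carries the nsds5ReplConflict attribute."""
    return bool(attr_values(entry, "nsds5ReplConflict"))


# Hypothetical glue entry for an organizational unit.
parent = {
    "dn": ["ou=parent,dc=example,dc=com"],
    "objectClass": ["top", "organizationalunit", "glue", "extensibleobject"],
    "ou": ["parent"],
}

print(is_glue(parent), has_conflict_marker(parent))
```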

Procedure

  1. List the orphan entry conflicts:

    # dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict list-glue suffix
    dn: ou=parent,dc=example,dc=com
    objectClass: top
    objectClass: organizationalunit
    objectClass: glue
    objectClass: extensibleobject
    ou: parent
  2. If orphan entry conflicts exist, decide how to proceed:

    • To delete a glue entry and its child entries, enter:

      # dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict delete-glue "ou=parent,dc=example,dc=com"
      dn: ou=parent,dc=example,dc=com
      objectClass: top
      objectClass: organizationalunit
      objectClass: extensibleobject
      ou: parent
    • To convert a glue entry into a regular entry, enter:

      # dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict convert-glue "ou=parent,dc=example,dc=com"

19.3. Identifying and solving errors about obsolete or missing suppliers

Directory Server stores information about the replication topology, such as all suppliers that send updates to other replicas, in a set of metadata called the replica update vector (RUV). An RUV contains information about each supplier, such as its ID and URL, the change sequence number (CSN) of the last change on the local server, and the CSN of the first change. Both suppliers and consumers store RUV information, and they use it to control replication updates.

When you remove a supplier from the replication topology, information about it can remain in another replica’s RUV. You can use a cleanallruv task to remove the RUV entry from all suppliers in the topology.

Prerequisites

  • Replication is enabled.

Procedure

  1. Monitor the /var/log/dirsrv/slapd-instance_name/errors log file and search for entries similar to the following:

    [22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - ruv_compare_ruv: RUV [changelog max RUV] does not contain element [{replica 8 ldap://server2.example.com:389} 4aac3e59000000080000 4c6f2a02000000080000] which is present in RUV [database RUV]
    ...
    [22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - replica_check_for_data_reload: Warning: for replica dc=example,dc=com there were some differences between the changelog max RUV and the database RUV. If there are obsolete elements in the database RUV, you should remove them using the CLEANALLRUV task. If they are not obsolete, you should check their status to see why there are no changes from those servers in the changelog.

    In this case, replica ID 8 causes the error.

  2. Display all RUV records and replica IDs, both valid and invalid:

    # dsconf -D "cn=Directory Manager" ldap://server1.example.com replication get-ruv --suffix "dc=example,dc=com"
    RUV:        {replica 1 ldap://server1.example.com} 61a4d8f8000100010000 61a4f5b8000000010000
    
    Replica ID: 1
    LDAP URL:   ldap://server1.example.com
    Min CSN:    2021-11-29 13:43:20 1 0 (61a4d8f8000100010000)
    Max CSN:    2021-11-29 15:46:00 (61a4f5b8000000010000)
    RUV:        {replica 2 ldap://server2.example.com} 61a4d8fb000100020000 61a4f550000000020000
    
    Replica ID: 2
    LDAP URL:   ldap://server2.example.com
    Min CSN:    2021-11-29 13:43:23 1 0 (61a4d8fb000100020000)
    Max CSN:    2021-11-29 15:44:16 (61a4f550000000020000)
    RUV:        {replica 8 ldap://server3.example.com} 61a4d903000100080000 61a4d908000000080000
    
    Replica ID: 8
    LDAP URL:   ldap://server3.example.com
    Min CSN:    2021-11-29 13:43:31 1 0 (61a4d903000100080000)
    Max CSN:    2021-11-29 13:43:36 (61a4d908000000080000)

    Note the list of returned replica IDs: 1, 2, and 8.

  3. Run the cleanup task for replica ID 8:

    # dsconf -D "cn=Directory Manager" ldap://server1.example.com repl-tasks cleanallruv --suffix="dc=example,dc=com" --replica-id=8

    Note that Directory Server replicates RUV cleanup tasks. Therefore, you need to start the task on only one supplier.

    If one of the replicas cannot be reached, for example because it is down, you can use the --force-cleaning option to clean up the RUV immediately.
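Steps 1 and 3 of this procedure can be sketched in Python as follows. The script extracts replica IDs from ruv_compare_ruv messages in the errors log and builds the matching cleanallruv command line without running it. The regular expression is an assumption based on the log line shown in step 1, and the function names are illustrative. Before you run a generated command, confirm in the get-ruv output that the replica ID really is obsolete.

```python
import re
import shlex
from typing import List

# Matches the "{replica <id> <url>}" element in a ruv_compare_ruv message.
RUV_ELEMENT = re.compile(r"ruv_compare_ruv:.*?\{replica (\d+) [^}\s]+\}")

LOG = (
    "[22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - ruv_compare_ruv: "
    "RUV [changelog max RUV] does not contain element "
    "[{replica 8 ldap://server2.example.com:389} 4aac3e59000000080000 "
    "4c6f2a02000000080000] which is present in RUV [database RUV]\n"
)


def suspect_replica_ids(log_text: str) -> List[int]:
    """Return the sorted replica IDs named in ruv_compare_ruv messages."""
    return sorted({int(match.group(1)) for match in RUV_ELEMENT.finditer(log_text)})


def cleanallruv_argv(server_url: str, suffix: str, replica_id: int,
                     force: bool = False) -> List[str]:
    """Build the dsconf argument list for a cleanallruv task (not executed)."""
    argv = [
        "dsconf", "-D", "cn=Directory Manager", server_url,
        "repl-tasks", "cleanallruv",
        f"--suffix={suffix}", f"--replica-id={replica_id}",
    ]
    if force:
        # Use only when a replica cannot be reached.
        argv.append("--force-cleaning")
    return argv


for rid in suspect_replica_ids(LOG):
    command = cleanallruv_argv("ldap://server1.example.com", "dc=example,dc=com", rid)
    print(shlex.join(command))
```

Printing the command instead of executing it keeps the decision to clean a replica ID with the administrator.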

Verification

  • Display the RUV records and replica IDs:

    # dsconf -D "cn=Directory Manager" ldap://server1.example.com replication get-ruv --suffix "dc=example,dc=com"
    RUV:        {replica 1 ldap://server1.example.com} 61a4d8f8000100010000 61a4f5b8000000010000
    
    Replica ID: 1
    LDAP URL:   ldap://server1.example.com
    Min CSN:    2021-11-29 13:43:20 1 0 (61a4d8f8000100010000)
    Max CSN:    2021-11-29 15:46:00 (61a4f5b8000000010000)
    RUV:        {replica 2 ldap://server2.example.com} 61a4d8fb000100020000 61a4f550000000020000
    
    Replica ID: 2
    LDAP URL:   ldap://server2.example.com
    Min CSN:    2021-11-29 13:43:23 1 0 (61a4d8fb000100020000)
    Max CSN:    2021-11-29 15:44:16 (61a4f550000000020000)

    The command no longer returns an RUV entry for replica ID 8.
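The verification can also be scripted. The following Python sketch parses get-ruv output in the format shown above and reports whether a cleaned replica ID is still listed. It assumes that each "Replica ID" line is followed by the "LDAP URL" line of the same replica, as in the sample output. The function names are illustrative, and the sample text is an abbreviated copy of the output above.

```python
from typing import Dict, Iterable, Optional, Set

AFTER_CLEANUP = """\
RUV:        {replica 1 ldap://server1.example.com} 61a4d8f8000100010000 61a4f5b8000000010000

Replica ID: 1
LDAP URL:   ldap://server1.example.com
RUV:        {replica 2 ldap://server2.example.com} 61a4d8fb000100020000 61a4f550000000020000

Replica ID: 2
LDAP URL:   ldap://server2.example.com
"""


def replica_urls(ruv_output: str) -> Dict[int, Optional[str]]:
    """Map each replica ID in the get-ruv output to its LDAP URL."""
    replicas: Dict[int, Optional[str]] = {}
    current_id: Optional[int] = None
    for line in ruv_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Replica ID":
            current_id = int(value)
            replicas[current_id] = None
        elif key == "LDAP URL" and current_id is not None:
            replicas[current_id] = value
    return replicas


def still_present(ruv_output: str, cleaned_ids: Iterable[int]) -> Set[int]:
    """Return the cleaned replica IDs that the RUV still lists."""
    return set(replica_urls(ruv_output)) & set(cleaned_ids)


leftover = still_present(AFTER_CLEANUP, [8])
print("cleanup complete" if not leftover else f"still present: {sorted(leftover)}")
```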