15.25. Solving Common Replication Conflicts
Multi-supplier replication uses an eventual-consistency replication model. This means that the same entries can be changed on different servers. When replication occurs between these servers, the conflicting changes need to be resolved. In most cases, resolution occurs automatically, based on the time stamp associated with the change on each server: the most recent change takes precedence.
However, some conflicts require manual intervention to reach a resolution. These are entries with a change conflict that the replication process cannot resolve automatically.
To list conflict entries, enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict list dc=example,dc=com
15.25.1. Solving Naming Conflicts
When two entries are created with the same DN on different servers, the automatic conflict resolution procedure during replication renames the last entry created by including the entry's unique identifier in the DN. Every directory entry includes a unique identifier stored in the nsuniqueid operational attribute. When a naming conflict occurs, this unique ID is appended to the non-unique DN.
For example, if the uid=user_name,ou=People,dc=example,dc=com entry was created on two different servers, replication adds the unique ID to the DN of the last entry created. This means that the following entries exist:
uid=user_name,ou=People,dc=example,dc=com
nsuniqueid=66446001-1dd211b2+uid=user_name,ou=People,dc=example,dc=com
To resolve the replication conflict, you must manually decide how to proceed:
- To keep only the valid entry (uid=user_name,ou=People,dc=example,dc=com) and delete the conflict entry, enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict delete nsuniqueid=66446001-1dd211b2+uid=user_name,ou=People,dc=example,dc=com
- To keep only the conflict entry (nsuniqueid=66446001-1dd211b2+uid=user_name,ou=People,dc=example,dc=com), enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict swap nsuniqueid=66446001-1dd211b2+uid=user_name,ou=People,dc=example,dc=com
- To keep both entries, enter:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict convert --new-rdn=uid=user_name_NEW nsuniqueid=66446001-1dd211b2+uid=user_name,ou=People,dc=example,dc=com
To keep the conflict entry, you must specify a new Relative Distinguished Name (RDN) in the --new-rdn option to rename the conflict entry. The previous command renames the conflict entry to uid=user_name_NEW,ou=People,dc=example,dc=com.
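When scripting around conflict entries, it can help to separate the unique ID from the original DN. The following is a minimal sketch, not part of Directory Server: the helper name split_conflict_dn is hypothetical, and it assumes the conflict DN has the form shown above (nsuniqueid=ID+original_RDN,parent) with no escaped "+" characters in the first RDN.

```python
from typing import Tuple


def split_conflict_dn(conflict_dn: str) -> Tuple[str, str]:
    """Split 'nsuniqueid=<id>+<rdn>,<parent>' into (<id>, '<rdn>,<parent>').

    Hypothetical helper; assumes no escaped '+' in the leading RDN.
    """
    prefix = "nsuniqueid="
    if not conflict_dn.lower().startswith(prefix):
        raise ValueError("not a conflict DN: " + conflict_dn)
    # The first RDN is multi-valued: '+' separates the unique ID
    # from the RDN that the entry was originally created with.
    unique_id, sep, original_dn = conflict_dn[len(prefix):].partition("+")
    if not sep or "," in unique_id:
        raise ValueError("no '+' in the first RDN: " + conflict_dn)
    return unique_id, original_dn


unique_id, original_dn = split_conflict_dn(
    "nsuniqueid=66446001-1dd211b2+uid=user_name,ou=People,dc=example,dc=com"
)
print(unique_id)    # 66446001-1dd211b2
print(original_dn)  # uid=user_name,ou=People,dc=example,dc=com
```

The second value is the DN of the entry that the conflict entry collided with, which is the entry to compare it against before choosing delete, swap, or convert.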
15.25.2. Solving Orphan Entry Conflicts
When a delete operation is replicated and the consumer server finds that the entry to be deleted has child entries, the conflict resolution procedure creates a glue entry to avoid having orphaned entries in the directory.
In the same way, when an add operation is replicated and the consumer server cannot find the parent entry, the conflict resolution procedure creates a glue entry representing the parent so that the new entry is not an orphan entry.
Glue entries are temporary entries that include the object classes glue and extensibleObject. Glue entries can be created in several ways:
- If the conflict resolution procedure finds a deleted entry with a matching unique identifier, the glue entry is a resurrection of that entry, with the addition of the glue object class and the nsds5ReplConflict attribute.
In such cases, either modify the glue entry to remove the glue object class and the nsds5ReplConflict attribute to keep the entry as a normal entry, or delete the glue entry and its child entries.
- The server creates a minimalistic entry with the glue and extensibleObject object classes.
In such cases, modify the entry to turn it into a meaningful entry or delete it and all of its child entries.
To list all glue entries:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict list-glue suffix
To delete a glue entry and its child entries:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict delete-glue DN_of_glue_entry
To convert a glue entry into a regular entry:
# dsconf -D "cn=Directory Manager" ldap://server.example.com repl-conflict convert-glue DN_of_glue_entry
15.25.3. Resolving Errors for Obsolete or Missing Suppliers
Information about the replication topology, that is, all suppliers that supply updates to each other and to other replicas within the same replication group, is contained in a set of metadata called the replica update vector (RUV). The RUV contains information about each supplier, such as its ID and URL, its latest change state number (CSN) on the local server, and the CSN of the first change. Both suppliers and consumers store RUV information, and they use it to control replication updates.
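The CSN values that appear in RUV elements can be decoded to see which replica generated a change. The following sketch assumes the 20-character hexadecimal layout commonly described for Directory Server CSNs (8 digits of time stamp, 4 of sequence number, 4 of replica ID, 4 of sub-sequence number). This layout is an assumption and is not stated in this section; the names CSN and parse_csn are illustrative only.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: int    # seconds since the epoch
    seq_num: int      # orders changes made within the same second
    replica_id: int   # ID of the replica that generated the change
    subseq_num: int


def parse_csn(text: str) -> CSN:
    """Decode a CSN string, assuming the 8+4+4+4 hex-digit layout."""
    if len(text) != 20:
        raise ValueError("expected 20 hex digits, got: " + text)
    return CSN(
        timestamp=int(text[0:8], 16),
        seq_num=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseq_num=int(text[16:20], 16),
    )


csn = parse_csn("55e74b8c000000140000")
print(csn.replica_id)  # 20
print(datetime.fromtimestamp(csn.timestamp, tz=timezone.utc).isoformat())
```

Under this assumption, the replica ID embedded in a CSN matches the ID in the RUV element that carries it, which gives a quick consistency check when reading RUV output.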
When one supplier is removed from the replication topology, it may remain in another replica's RUV. When the other replica is restarted, it can record errors in its log, warning that the replication plug-in does not recognize the removed supplier. The errors will look similar to the following example:
[22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - ruv_compare_ruv: RUV [changelog max RUV] does not contain element [{replica 8 ldap://m2.example.com:389} 4aac3e59000000080000 4c6f2a02000000080000] which is present in RUV [database RUV]
<...several more samples...>
[22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - replica_check_for_data_reload: Warning: for replica dc=example,dc=com there were some differences between the changelog max RUV and the database RUV. If there are obsolete elements in the database RUV, you should remove them using the CLEANALLRUV task. If they are not obsolete, you should check their status to see why there are no changes from those servers in the changelog.
Note the replica and its ID; in this case, replica 8.
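If the error log contains many such messages, the replica IDs can be collected with a short script. This is a minimal sketch that assumes the message format shown in the example above; the function name replicas_in_log and the regular expression are illustrative and not part of Directory Server.

```python
import re

# Matches RUV elements such as "{replica 8 ldap://m2.example.com:389}".
RUV_ELEMENT = re.compile(r"\{replica (\d+) (ldaps?://[^}\s]+)\}")


def replicas_in_log(lines):
    """Map replica ID to URL for every ruv_compare_ruv message in lines."""
    found = {}
    for line in lines:
        if "ruv_compare_ruv" not in line:
            continue
        for replica_id, url in RUV_ELEMENT.findall(line):
            found[int(replica_id)] = url
    return found


log_lines = [
    "[22/Jan/2021:17:16:01 -0500] NSMMReplicationPlugin - ruv_compare_ruv: "
    "RUV [changelog max RUV] does not contain element "
    "[{replica 8 ldap://m2.example.com:389} 4aac3e59000000080000 "
    "4c6f2a02000000080000] which is present in RUV [database RUV]",
]
print(replicas_in_log(log_lines))  # {8: 'ldap://m2.example.com:389'}
```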
When the supplier is permanently removed from the topology, any lingering metadata about that supplier should be purged from every other supplier's RUV entry. Use the cleanallruv directory task to remove a RUV entry from all suppliers in the topology.
Note
The cleanallruv task is replicated. Therefore, you only need to run it on one supplier.
Procedure 15.1. Removing an Obsolete or Missing Supplier Using the cleanallruv Task Operation
- List all RUV records and replica IDs, both valid and invalid, because deleted suppliers may have left metadata on other suppliers:
# ldapsearch -o ldif-wrap=no -xLLL -H ldap://m1.example.com -D "cn=Directory Manager" -W -b dc=example,dc=com '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))' nsDS5ReplicaId nsDS5ReplicaType nsds50ruv
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsDS5ReplicaId: 1
nsDS5ReplicaType: 3
nsds50ruv: {replicageneration} 55d5093a000000010000
nsds50ruv: {replica 1 ldap://m1.example.com:389} 55d57026000000010000 55d57275000000010000
nsds50ruv: {replica 20 ldap://m2.example.com:389} 55e74b8c000000140000 55e74bf7000000140000
nsds50ruv: {replica 9 ldap://m2.example.com:389}
nsds50ruv: {replica 8 ldap://m2.example.com:389} 506f921f000000080000 50774211000500080000
Note the returned replica IDs: 1, 20, 9, and 8.
- List the currently defined and valid replica IDs of all suppliers that replicate databases by searching the cn=replica replica configuration entries under the cn=config suffix. Run the search against each server.
Note
Consumers and read-only nodes always have the replica ID set to 65535, and nsDS5ReplicaType: 3 signifies a supplier.
# ldapsearch -o ldif-wrap=no -xLLL -H ldap://m1.example.com -D "cn=Directory Manager" -W -b cn=config cn=replica nsDS5ReplicaId nsDS5ReplicaType
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsDS5ReplicaId: 1
nsDS5ReplicaType: 3
# ldapsearch -o ldif-wrap=no -xLLL -H ldap://m2.example.com -D "cn=Directory Manager" -W -b cn=config cn=replica nsDS5ReplicaId nsDS5ReplicaType
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsDS5ReplicaId: 20
nsDS5ReplicaType: 3
After you search all URIs returned in the first step (in this procedure, m1.example.com and m2.example.com), compare the list of returned suppliers (entries which have nsDS5ReplicaType: 3) to the list of RUVs from the previous step. In the above example, this search only returned IDs 1 and 20, but the previous search also returned 9 and 8 on URI m2.example.com. This means that the latter two are removed, and their RUVs need to be cleaned.
- After determining which RUVs require cleaning, create a new
cn=cleanallruv,cn=tasks,cn=config
entry and provide the following information about your replication configuration:
  - The base DN of the replicated database (replica-base-dn)
  - The replica ID (replica-id)
  - Whether to catch up to the maximum change state number (CSN) from the missing supplier, or whether to just remove all RUV entries and miss any updates (replica-force-cleaning); setting this attribute to no means that the task will wait for all the configured replicas to catch up with all the changes from the removed replica first, and then remove the RUV.
# dsconf -D "cn=Directory Manager" ldap://m2.example.com repl-tasks \
    cleanallruv --suffix="dc=example,dc=com" --replica-id=8
Note
The cleanallruv task is replicated. Therefore, you only need to run it on one supplier.
Repeat the same for every RUV you want to clean (ID 9 in this procedure).
- After cleaning the RUVs of all replicas discovered earlier, you can again use the search from the first step to verify that all extra RUVs are removed:
# ldapsearch -o ldif-wrap=no -xLLL -H ldap://m1.example.com -D "cn=Directory Manager" -W -b dc=example,dc=com '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))' nsDS5ReplicaId nsDS5ReplicaType nsds50ruv
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsDS5ReplicaId: 1
nsDS5ReplicaType: 3
nsds50ruv: {replicageneration} 55d5093a000000010000
nsds50ruv: {replica 1 ldap://m1.example.com:389} 55d57026000000010000 55d57275000000010000
nsds50ruv: {replica 20 ldap://m2.example.com:389} 55e74b8c000000140000 55e74bf7000000140000
As you can see in the above output, replica IDs 8 and 9 are no longer present, which indicates that their RUVs have been cleaned successfully.
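The comparison in this procedure (replica IDs present in the RUV but not configured on any supplier) can be sketched in a few lines of Python. The sketch works on saved ldapsearch output like the examples above and only prints the dsconf commands from this procedure without running them. The function names and the default suffix and URL values are illustrative assumptions, not part of Directory Server.

```python
import re
import shlex


def ruv_replica_ids(ldif_text):
    """Replica IDs named in nsds50ruv values (output of the first search)."""
    pattern = r"^nsds50ruv: \{replica (\d+) "
    return {int(m) for m in re.findall(pattern, ldif_text, flags=re.MULTILINE)}


def configured_supplier_ids(ldif_text):
    """IDs of cn=replica entries that have nsDS5ReplicaType: 3."""
    ids = set()
    for entry in re.split(r"\n\s*\n", ldif_text.strip()):
        attrs = dict(l.split(": ", 1) for l in entry.splitlines() if ": " in l)
        if attrs.get("nsDS5ReplicaType") == "3":
            ids.add(int(attrs["nsDS5ReplicaId"]))
    return ids


def cleanallruv_command(replica_id, suffix="dc=example,dc=com",
                        url="ldap://m2.example.com"):
    """Argument list for the dsconf call shown in this procedure."""
    return ["dsconf", "-D", "cn=Directory Manager", url, "repl-tasks",
            "cleanallruv", f"--suffix={suffix}", f"--replica-id={replica_id}"]


# Raw strings, because the DNs contain backslash escapes such as \3D.
ruv_output = r"""dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 55d5093a000000010000
nsds50ruv: {replica 1 ldap://m1.example.com:389} 55d57026000000010000 55d57275000000010000
nsds50ruv: {replica 20 ldap://m2.example.com:389} 55e74b8c000000140000 55e74bf7000000140000
nsds50ruv: {replica 9 ldap://m2.example.com:389}
nsds50ruv: {replica 8 ldap://m2.example.com:389} 506f921f000000080000 50774211000500080000
"""
config_output = r"""dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsDS5ReplicaId: 1
nsDS5ReplicaType: 3

dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsDS5ReplicaId: 20
nsDS5ReplicaType: 3
"""

obsolete = sorted(ruv_replica_ids(ruv_output) - configured_supplier_ids(config_output))
print(obsolete)  # [8, 9]
for replica_id in obsolete:
    print(shlex.join(cleanallruv_command(replica_id)))
```

Printing the commands instead of executing them keeps the decision to clean a RUV with the administrator, which matters because an ID that is missing from the configuration search may belong to a supplier that was simply unreachable.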