Chapter 9. Managing Indexes
9.1. About Indexes
9.1.1. About Index Types
- Presence index (pres) contains a list of the entries that contain a particular attribute, which is very useful for searches. For example, it makes it easy to examine any entries that contain access control information. Generating an aci.db4 file that includes a presence index lets the server efficiently perform the search for ACI=* to generate the access control list for the server. The presence index is not used for base object searches.
- Approximate index (approx) is used for efficient approximate or sounds-like searches. For example, an entry may include the attribute value cn=Robert E Lee. An approximate search would return this value for searches against cn~=Lee. Similarly, a search against l~=San Fransisco (note the misspelling) would return entries that include the correctly spelled city name.
- Substring index (sub) is a costly index to maintain, but it allows efficient searching against substrings within entries. Substring indexes are limited to a minimum of three characters for each entry. For example, searches of the form cn=*derson match the common names containing strings such as Jill Henderson or Steve Sanderson. Similarly, the search for telephoneNumber=*555* returns all the entries in the directory with telephone numbers that contain 555.
- International index speeds up searches for information in international directories. The process for creating an international index is similar to the process for creating regular indexes, except that it applies a matching rule by associating an object identifier (OID) with the attributes to be indexed. The supported locales and their associated OIDs are listed in Appendix D, Internationalization. If there is a need to configure the Directory Server to accept additional matching rules, contact Red Hat Professional Services.
- Browsing index, or virtual list view (VLV) index, speeds up the display of entries in the Directory Server Console. This index is particularly useful if a branch of your directory contains hundreds of entries; for example, the ou=people branch. You can create a browsing index on any branch point in the directory tree to improve display performance through the Directory Server Console or by using the vlvindex command-line tool, which is explained in the Directory Server Configuration and Command-Line Tool Reference.
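The relationship between filter syntax and index type can be illustrated with a short Python sketch. It is only an illustration, not server code: the function name `index_type_for` is hypothetical, the `eq` label for plain equality filters is an assumed abbreviation alongside the `pres`, `approx`, and `sub` labels used above, and only simple single-attribute filter items are handled (no ordering operators, no nested AND/OR).

```python
def index_type_for(filter_item: str) -> str:
    """Guess which index type a simple filter item would consult.

    Handles only the four forms discussed above:
    attr=*  (presence), attr~=value (approximate),
    attr=*val* (substring), attr=value (equality).
    """
    item = filter_item.strip().strip("()")
    if "~=" in item:
        return "approx"
    attr, sep, value = item.partition("=")
    if not sep or not attr:
        raise ValueError(f"not a simple filter item: {filter_item!r}")
    value = value.strip()
    if value == "*":
        return "pres"      # presence: any value at all
    if "*" in value:
        return "sub"       # wildcard inside the value
    return "eq"            # assumed label for an equality match


for example in ("(aci=*)", "cn~=Lee", "cn=*derson", "telephoneNumber=*555*"):
    print(example, "->", index_type_for(example))
```

A filter that maps to an index type the attribute does not maintain would fall back to the unindexed search path described in Section 9.1.3.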
9.1.2. About Default, System, and Standard Indexes
9.1.2.1. Overview of Default Indexes
Table 9.1. Default Indexes
|cn||Improves the performance of the most common types of user directory searches.|
|givenname||Improves the performance of the most common types of user directory searches.|
|Improves the performance of the most common types of user directory searches.|
|mailHost||Used by a messaging server.|
|member||Improves Directory Server performance. This index is also used by the Referential Integrity Plug-in. See Section 3.6, “Maintaining Referential Integrity” for more information.|
|owner||Improves Directory Server performance. This index is also used by the Referential Integrity Plug-in. See Section 3.6, “Maintaining Referential Integrity” for more information.|
|seeAlso||Improves Directory Server performance. This index is also used by the Referential Integrity Plug-in. See Section 3.6, “Maintaining Referential Integrity” for more information.|
|sn||Improves the performance of the most common types of user directory searches.|
|telephoneNumber||Improves the performance of the most common types of user directory searches.|
|uid||Improves Directory Server performance.|
9.1.2.2. Overview of System Indexes
Table 9.2. System Indexes
9.1.2.3. Overview of Standard Indexes
The standard index file, id2entry.db4, exists by default in Directory Server; you do not need to generate it. The id2entry.db4 file contains the actual directory database entries. All other database files can be recreated from this one.
9.1.3. Overview of the Searching Algorithm
Each index contains a list of attributes (such as the cn, common name, attribute) and a pointer to the entries corresponding to each value. Directory Server processes a search request as follows:
- An LDAP client application sends a search request to the directory.
- The directory examines the incoming request to make sure that the specified base DN matches a suffix contained by one or more of its databases or database links.
- If they do match, the directory processes the request.
- If they do not match, the directory returns an error to the client indicating that the suffix does not match. If a referral has been specified in cn=config, the directory also returns the LDAP URL where the client can attempt to pursue the request.
- The Directory Server examines the search filter to see what indexes apply, and it attempts to load the list of entry IDs from each index that satisfies the filter. The ID lists are combined based on whether the filter used AND or OR joins.
- If the list of entry IDs is larger than the configured ID list scan limit or if there is no index, then the Directory Server searches every entry in the database. This is an unindexed search.
- The Directory Server reads every entry from the id2entry.db4 database or the entry cache for every entry ID in the ID list (or from the entire database for an unindexed search). The server then checks the entries to see if they match the search filter. Each match is returned as it is found. The server continues through the list of IDs until it has searched all candidate entries or until it hits one of the configured resource limits. (Resource limits are listed in Table 10.1, “Resource Limit Attributes”.)
Note: It is possible to set separate resource limits for searches using the simple paged results control. For example, administrators can set high or unlimited size and look-through limits with paged searches, but use the lower default limits for non-paged searches.
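The candidate-list part of the procedure above can be sketched in Python. This is a toy model, not Directory Server code: the names `lookup`, `candidates`, `matches`, and `search`, the dictionary-based `entries` and `indexes` structures, and the very small `scan_limit` are all hypothetical. Only equality conditions are modeled, and the decision to intersect just the ID lists that are available for an AND filter is an assumption made for the sketch.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Entry = Dict[str, List[str]]
Condition = Tuple[str, str]              # (attribute, value), equality only
Indexes = Dict[str, Dict[str, Set[int]]]


def lookup(indexes: Indexes, cond: Condition, scan_limit: int) -> Optional[Set[int]]:
    """Return the ID list for one condition, or None if it is unusable."""
    attr, value = cond
    if attr not in indexes:
        return None                      # no index for this attribute
    ids = indexes[attr].get(value, set())
    return None if len(ids) > scan_limit else ids


def candidates(indexes: Indexes, join: str, conds: List[Condition],
               scan_limit: int) -> Optional[Set[int]]:
    """Combine ID lists by AND/OR; None means 'scan every entry'."""
    lists = [lookup(indexes, c, scan_limit) for c in conds]
    if join == "or":
        if any(ids is None for ids in lists):
            return None
        return set().union(*lists)
    known = [ids for ids in lists if ids is not None]
    return set.intersection(*known) if known else None


def matches(entry: Entry, join: str, conds: List[Condition]) -> bool:
    hits = [value in entry.get(attr, []) for attr, value in conds]
    return any(hits) if join == "or" else all(hits)


def search(entries: Dict[int, Entry], indexes: Indexes, join: str,
           conds: List[Condition], scan_limit: int = 2,
           size_limit: int = 100) -> Iterator[int]:
    ids = candidates(indexes, join, conds, scan_limit)
    if ids is None:
        ids = set(entries)               # unindexed search: every entry
    returned = 0
    for entry_id in sorted(ids):
        if returned >= size_limit:       # stand-in for a resource limit
            return
        # Every candidate is still checked against the full filter.
        if matches(entries[entry_id], join, conds):
            returned += 1
            yield entry_id


entries = {
    1: {"cn": ["John Doe"], "ou": ["people"]},
    2: {"cn": ["Jill Henderson"], "ou": ["people"]},
    3: {"cn": ["Steve Sanderson"], "ou": ["people"]},
}
# Only cn is indexed in this toy data set; ou is not.
indexes = {"cn": {"John Doe": {1}, "Jill Henderson": {2}, "Steve Sanderson": {3}}}

print(list(search(entries, indexes, "and", [("cn", "John Doe"), ("ou", "people")])))
print(list(search(entries, indexes, "or", [("cn", "John Doe"), ("ou", "people")])))
```

In the first call the cn index narrows the candidates to a single entry. In the second, the OR join includes an attribute with no index, so the sketch falls back to reading every entry, which mirrors the unindexed search case.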
9.1.4. Approximate Searches
- All of the query string codes match the codes generated in the entry string.
- All of the query string codes are in the same order as the entry string codes.
|Name in the Directory (Phonetic Code)||Query String (Phonetic Code)||Match Comments|
|Alice B Sarette (ALS B SRT)||Alice Sarette (ALS SRT)||Matches. Codes are specified in the correct order.|
|Alice B Sarette (ALS B SRT)||Alice Sarrette (ALS SRT)||Matches. Codes are specified in the correct order, despite the misspelling of Sarette.|
|Alice B Sarette (ALS B SRT)||Surette (SRT)||Matches. The generated code exists in the original name, despite the misspelling of Sarette.|
|Alice B Sarette (ALS B SRT)||Bertha Sarette (BR0 SRT)||No match. The code BR0 does not exist in the original name.|
|Alice B Sarette (ALS B SRT)||Sarette, Alice (SRT ALS)||No match. The codes are not specified in the correct order.|
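The two matching conditions and the table rows above amount to an ordered-subsequence test on phonetic codes. The Python sketch below illustrates only that comparison. It does not generate phonetic codes; the codes are copied from the table, and the function name `approx_match` is hypothetical.

```python
from typing import Sequence


def approx_match(entry_codes: Sequence[str], query_codes: Sequence[str]) -> bool:
    """True if every query code occurs in entry_codes, in the same order."""
    remaining = iter(entry_codes)
    # 'code in remaining' advances the iterator past the match, so a later
    # query code can only match a later entry code.
    return all(code in remaining for code in query_codes)


entry = ["ALS", "B", "SRT"]                  # Alice B Sarette
queries = {
    "Alice Sarette": ["ALS", "SRT"],
    "Surette": ["SRT"],
    "Bertha Sarette": ["BR0", "SRT"],
    "Sarette, Alice": ["SRT", "ALS"],
}
for name, codes in queries.items():
    print(name, "->", "match" if approx_match(entry, codes) else "no match")
```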
9.1.5. Indexing Performance
- The ID list structures, which were the province of the Directory Server back end and opaque to the storage manager.
- The storage manager structures (Btrees), which were opaque to the Directory Server back end code.
Because the ID list structures were opaque to the storage manager, any update to an ID list looked to the storage manager as if the entire list had changed. For a single ID that was inserted or deleted from an ID list, the corresponding number of bytes written to the transaction log was the maximum configured size for that ID list, about 8 kilobytes. Also, every database page on which the list was stored was marked as dirty, since the entire list had changed.
- For long ID lists, the number of bytes written to the transaction log for any update to the list is significantly reduced, from the maximum ID list size (8 kilobytes) to twice the size of one ID (4 bytes).
- For short ID lists, storage efficiency, and in most cases performance, is improved because only the storage manager metadata needs to be stored, not the ID list metadata.
- The average number of database pages marked as dirty per ID insert or delete operation is very small because a large number of duplicate keys will fit into each database page.
9.1.6. Balancing the Benefits of Indexing
- Approximate indexes are not efficient for attributes commonly containing numbers, such as telephone numbers.
- Substring indexes do not work for binary attributes.
- Equality indexes should be avoided if the value is big (such as attributes intended to contain photographs or passwords containing encrypted data).
- Maintaining indexes for attributes not commonly used in a search increases overhead without improving global searching performance.
- Attributes that are not indexed can still be specified in search requests, although the search performance may be degraded significantly, depending on the type of search.
- The more indexes you maintain, the more disk space you require.
- The Directory Server receives an add or modify operation.
- The Directory Server examines the indexing attributes to determine whether an index is maintained for the attribute values.
- If the created attribute values are indexed, then the Directory Server generates the new index entries.
- Once the server completes the indexing, the actual attribute values are created according to the client request.
dn: cn=John Doe,ou=People,dc=example,dc=com
objectclass: top
objectClass: person
objectClass: orgperson
objectClass: inetorgperson
cn: John Doe
cn: John
sn: Doe
ou: Manufacturing
ou: people
telephoneNumber: 408 555 8834
description: Manufacturing lead for the Z238 line of widgets.
- Equality, approximate, and substring indexes for the cn (common name) and sn (surname) attributes.
- Equality and substring indexes for the telephone number attribute.
- Substring indexes for the description attribute.
- Create the cn equality index entries for John and John Doe.
- Create the appropriate cn approximate index entries for John and John Doe.
- Create the appropriate cn substring index entries for John and John Doe.
- Create the sn equality index entry for Doe.
- Create the appropriate sn approximate index entry for Doe.
- Create the appropriate sn substring index entries for Doe.
- Create the telephone number equality index entry for 408 555 8834.
- Create the appropriate telephone number substring index entries for 408 555 8834.
- Create the appropriate description substring index entries for Manufacturing lead for the Z238 line of widgets. A large number of substring entries are generated for this string.
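To give a feel for why the substring index dominates the cost in this example, the following Python sketch counts the index keys a simplified model would generate for the entry above. It is a rough illustration under stated assumptions, not the server's actual key format: the names `substring_keys` and `index_key_counts` are hypothetical, substring keys are modeled as plain three-character windows with no normalization, equality indexes are modeled as one key per value, and approximate indexes as one key per word.

```python
from typing import Dict, List, Set, Tuple


def substring_keys(value: str, width: int = 3) -> Set[str]:
    """All distinct windows of `width` characters in the value."""
    return {value[i:i + width] for i in range(len(value) - width + 1)}


def index_key_counts(entry: Dict[str, List[str]],
                     config: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Count distinct keys per (attribute, index type) in this toy model."""
    counts: Dict[Tuple[str, str], int] = {}
    for attr, index_types in config.items():
        values = entry.get(attr, [])
        for index_type in sorted(index_types):
            if index_type == "eq":
                keys = set(values)                                # one per value
            elif index_type == "approx":
                keys = {w for v in values for w in v.split()}     # one per word
            else:  # "sub"
                keys = set().union(*(substring_keys(v) for v in values))
            counts[(attr, index_type)] = len(keys)
    return counts


entry = {
    "cn": ["John Doe", "John"],
    "sn": ["Doe"],
    "telephoneNumber": ["408 555 8834"],
    "description": ["Manufacturing lead for the Z238 line of widgets."],
}
config = {
    "cn": {"eq", "approx", "sub"},
    "sn": {"eq", "approx", "sub"},
    "telephoneNumber": {"eq", "sub"},
    "description": {"sub"},
}

for (attr, index_type), n in index_key_counts(entry, config).items():
    print(f"{attr:16} {index_type:7} {n}")
```

Even in this simplified model, the single description value produces several times more keys than all of the cn keys combined, which is the trade-off to weigh before enabling substring indexes on long free-text attributes.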