Chapter 8. Configuration

8.1. Using ModeShape

Using ModeShape within your application is actually quite straightforward, and with JCR 2.0 it is possible for your application to do everything using only the JCR 2.0 API. Your application will first obtain a javax.jcr.Repository instance, and will use that object to create sessions through which your application will read, modify, search, or monitor content in the repository.

8.2. ModeShape Configuration Options

The following options are available for configuring ModeShape:
  • Loading configuration from a file is conceptually the most straightforward and requires the least amount of Java code, but it does require having a configuration file. This is easy, allows one to manage configurations in version control, enables your application to use only the standard JCR API, and will likely be the best approach for most applications. If you're not sure, use this approach.
  • Loading configuration from a repository is an advanced option allowing multiple JcrEngine instances (usually in different processes perhaps on different machines) to easily access a (shared) configuration.
Each of these approaches has different advantages.

8.3. Loading Configuration from a File

The modeshape-config.xml file in SOA_ROOT/jboss-as/server/PROFILE/deploy/modeshape-services.jar is used by default.
Here is an example configuration file used in the repository example covered in the ModeShape Getting Started Guide document, though it has been simplified for clarity:
<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns:mode="http://www.modeshape.org/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <!-- 
  Define the JCR repositories 
  -->
  <mode:repositories>
      <!-- 
      Define a JCR repository that accesses the 'Cars' source directly.
      This of course is optional, since we could access the same content through 'vehicles'.
      -->
      <mode:repository jcr:name="car repository" mode:source="Cars">
          <mode:options jcr:primaryType="mode:options">
              <mode:option jcr:name="jaasLoginConfigName" mode:value="modeshape-jcr"/>
          </mode:options>
          <mode:descriptors>
            <!-- 
            	This adds a JCR Repository descriptor named "myDescriptor" with a value of "foo".
            	So this code:
            	Repository repo = ...;
            	System.out.println(repo.getDescriptor("myDescriptor"));

            	Will now print out "foo".
            -->
            <myDescriptor mode:value="foo" />
          </mode:descriptors>
          <!-- 
                Import the custom node types defined in the named files. The values
                can be an absolute path to a classpath resource, an absolute file system
                path, a relative path on the file system (relative to where the process was
                started from), or a resolvable URL. If more than one node type definition 
                file is needed, the files can be listed as a single comma-delimited string
                in the 'mode:resource' attribute of the 'jcr:nodeTypes' element, or listed 
                individually using multiple mode:resource child elements (as shown below).
            -->
          <jcr:nodeTypes>
	           <mode:resource>/org/example/my-node-types.cnd</mode:resource>
	           <mode:resource>/org/example/additional-node-types.cnd</mode:resource>
	        </jcr:nodeTypes>
      </mode:repository>
  </mode:repositories>
   <!-- 
   Define the sources for the content. These sources are directly accessible using the 
   ModeShape-specific Graph API.
   -->
   <mode:sources jcr:primaryType="nt:unstructured">
       <mode:source jcr:name="Cars" 
              mode:classname="org.modeshape.graph.connector.inmemory.InMemoryRepositorySource" 
              mode:retryLimit="3" mode:defaultWorkspaceName="workspace1">
	           <mode:predefinedWorkspaceNames>workspace2</mode:predefinedWorkspaceNames>
	           <mode:predefinedWorkspaceNames>workspace3</mode:predefinedWorkspaceNames>
       </mode:source>
   </mode:sources>
   <!-- 
   Define the sequencers. This is an optional section. For this example, we're not using any sequencers. 
   -->
   <mode:sequencers>
       <!--mode:sequencer jcr:name="Image Sequencer">
           <mode:classname>
           	org.modeshape.sequencer.image.ImageMetadataSequencer
           </mode:classname>
           <mode:description>Image metadata sequencer</mode:description>        
           <mode:pathExpression>/foo/source => /foo/target</mode:pathExpression>
           <mode:pathExpression>/bar/source => /bar/target</mode:pathExpression>
       </mode:sequencer-->
   </mode:sequencers>
   <mode:mimeTypeDetectors>
       <mode:mimeTypeDetector jcr:name="Detector" 
                             mode:description="Standard extension-based MIME type detector"/>
   </mode:mimeTypeDetectors>
</configuration>

Note

This is the recommended approach if your application uses the standard and implementation-independent RepositoryFactory mechanism to obtain the JCR Repository reference.

8.4. Loading Configuration from a Repository

The first step is to create and configure the RepositorySource instance that we'll use to access the repository where the configuration is stored. Then, create a JcrConfiguration instance and load from this source:
RepositorySource configSource = ...
JcrConfiguration config = new JcrConfiguration();
config.loadFrom(configSource);
The loadFrom(...) method can be called any number of times, but each time it is called it completely wipes out any current notion of the configuration and replaces it with the configuration found in the supplied source.
There is an optional second parameter that defines the name of the workspace in the supplied source where the configuration content can be found. It is not needed if the workspace is the source's default workspace. There is an optional third parameter that defines the Path within the configuration repository identifying the parent node of the various configuration nodes. If not specified, it assumes "/". This makes it possible for the configuration content to be located at a different location in the hierarchical structure. (This is not often required, but it is very useful if your ModeShape configuration file is embedded within another XML file.)
Once the JcrConfiguration has been loaded from a RepositorySource, the JcrConfiguration instance can be used to modify the configuration and then save those changes back to the repository. This technique can be used to place a configuration into a repository (such as a database) for the first time:
RepositorySource configSource = ...	// a RepositorySource to an empty source
JcrConfiguration config = new JcrConfiguration();

// Bind the configuration to the repository source (which is initially empty)...
config.loadFrom(configSource);

// Now load a configuration from a file (or construct one programmatically) ...
String pathToFile = ... 
config.loadFrom(pathToFile);

// Now save the configuration into the source ...
config.save();
Now you can load this configuration in multiple processes, using the approach mentioned above.

Note

This is an advanced way of defining your configuration, so this is recommended only for those that are already very comfortable with ModeShape and its lower-level graph API and connector API.

8.5. JCR Repository Options

ModeShape JCR repositories have a number of behaviors that can be controlled from within the configuration. These are known as repository options, and all have sensible defaults. However, they do allow you to better configure the JCR repository instances to best suit your needs.
As mentioned earlier, these options can be set programmatically or within the configuration file. When setting up the configuration programmatically, the actual enum literal values must be used, and all values are String literals:
JcrConfiguration config = ...
config.repository("repository A")
     .setOption(JcrRepository.Option.JAAS_LOGIN_CONFIG_NAME, "modeshape-jcr");
When using a configuration file, you set the option within the "mode:options" fragment under the "mode:repository" section. Each option fragment looks similar to the following:
<mode:option jcr:name="jaasLoginConfigName" mode:value="modeshape-jcr"/>
where the "jcr:name" XML attribute value contains the lower-camel-case form of the option literal, and the "mode:value" XML attribute value contains the repository option value. In the example above, the "jaasLoginConfigName" is the option name, and "modeshape-jcr" is the option value. An alternative representation is to set the name using the XML element name and set the primary type with an XML attribute. Thus, this fragment is equivalent to the previous listing:
<jaasLoginConfigName jcr:primaryType="mode:option" mode:value="modeshape-jcr"/>
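The naming rule described above (the configuration-file name is the lower-camel-case form of the enumeration literal) can be sketched in a few lines of plain Java. This is only an illustration of the convention; the OptionNames class and its toOptionName method are hypothetical helpers and are not part of the ModeShape API.

```java
import java.util.Locale;

// Hypothetical helper (not part of ModeShape): derives the configuration-file
// name of a repository option from the name of its enumeration literal.
final class OptionNames {

    static String toOptionName(String enumLiteral) {
        StringBuilder name = new StringBuilder();
        for (String word : enumLiteral.split("_")) {
            if (word.isEmpty()) {
                continue; // tolerate doubled or leading underscores
            }
            String lower = word.toLowerCase(Locale.ROOT);
            if (name.length() == 0) {
                name.append(lower); // first word stays entirely lower case
            } else {
                // later words get an upper-case first letter
                name.append(Character.toUpperCase(lower.charAt(0)))
                    .append(lower.substring(1));
            }
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(toOptionName("JAAS_LOGIN_CONFIG_NAME"));
        System.out.println(toOptionName("QUERY_INDEX_DIRECTORY"));
    }
}
```

Applied to the literal JAAS_LOGIN_CONFIG_NAME, the helper yields the "jaasLoginConfigName" name used in the XML fragments above.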
The following table describes all of the current repository options.

Table 8.1. JCR Repository Options

Option Description
jaasLoginConfigName The JAAS application configuration name that specifies which login module should be used to validate credentials. By default, "modeshape-jcr" is used. Set the option with an empty (zero-length) value to completely turn off JAAS authentication (see Section 8.10, “Available Security Providers” for details). The enumeration literal is Option.JAAS_LOGIN_CONFIG_NAME.
systemSourceName
The name of the source (and optionally the workspace in the source) where the "/jcr:system" branch should be stored. The format is "name of workspace@name of source", or "name of source" if the default workspace is to be used. If this option is not used, a transient in-memory source will be used. Note that all leading and trailing whitespace is removed from both the source name and the workspace name. Thus, a value of "@" implies a zero-length workspace name and zero-length source name. Also, any use of the '@' character in source and workspace names must be escaped with a preceding backslash.
The enumeration literal is Option.SYSTEM_SOURCE_NAME.
anonymousUserRoles A comma-delimited list of default roles provided for anonymous access. A null or empty value for this option means that anonymous access is disabled. The enumeration literal is Option.ANONYMOUS_USER_ROLES.
exposeWorkspaceNamesInDescriptor
A boolean flag that indicates whether a complete list of workspace names should be exposed in the custom repository descriptor "org.modeshape.jcr.api.Repository.REPOSITORY_WORKSPACES". If this option is set to true, then any code that can access the repository can retrieve a complete list of workspace names through the javax.jcr.Repository.getDescriptor(String) method without logging in. The default value is 'true', meaning that the descriptor is populated.
Since some ModeShape installations may consider the list of workspace names to be restricted information and limit the ability of some or all users to see a complete list of workspace names, this option can be set to "false" to disable this capability. If this option is set to "false", the "org.modeshape.jcr.api.Repository.REPOSITORY_WORKSPACES" descriptor will not be set.
The enumeration literal is Option.EXPOSE_WORKSPACE_NAMES_IN_DESCRIPTOR.
repositoryJndiLocation A string property that when specified tells the JcrEngine where to put the Repository in JNDI. Assumes that you have write access to the JNDI tree. If no value set, then the Repository will not be bound to JNDI. The enumeration literal is Option.REPOSITORY_JNDI_LOCATION.
queryExecutionEnabled A boolean flag that specifies whether this repository is expected to execute searches and queries. If client applications will never perform searches or queries, then maintaining the query indexes is an unnecessary overhead, and can be disabled. Note that this is merely a hint, and that searches and queries might still work when this is set to 'false'. The default is 'true', meaning that clients can execute searches and queries. The enumeration literal is Option.QUERY_EXECUTION_ENABLED.
queryIndexDirectory
The system may maintain a set of indexes that improve the performance of searching and querying the content. The size of these indexes depends upon the size of the content being stored, and thus they may consume a significant amount of space. This option defines a location on the file system where this repository may (if needed) store indexes so they don't consume large amounts of memory.
If specified, the value must be a valid path to a writable directory on the file system. If the path specifies a nonexistent location, the repository may attempt to create the missing directories. The path may be absolute or relative to the location where this VM was started. If the specified location is not a readable and writable directory (or cannot be created as such), then this will generate an exception when the repository is created.
The default value is null, meaning the search indexes may not be stored on the local file system and, if needed, will be stored within memory.
The enumeration literal is Option.QUERY_INDEX_DIRECTORY.
queryIndexesUpdatedSynchronously
An advanced boolean flag that specifies whether updates to the indexes (if used) should be made synchronously, meaning that a call to Session.save() will not return until the search indexes have been completely updated. The benefit of synchronous updates is that a search or query performed immediately after a save() will operate upon content that was just changed. The downside is that the save() operation will take longer.
With asynchronous updates, however, the only work done during a save() invocation is that required to persist the changes in the underlying repository source, while changes to the search indexes are made in a different thread that may not run immediately. In this case, there may be an indeterminate lag before searching or querying after a save() will operate upon the changed content.
The default value is 'false', meaning the updates are performed asynchronously.
The enumeration literal is Option.QUERY_INDEXES_UPDATED_SYNCHRONOUSLY.
queryIndexesRebuiltSynchronously
An advanced boolean flag that specifies whether the indexes should be rebuilt synchronously when the repository restarts. If this flag is set to 'true', query indexes for each workspace in the repository will be rebuilt synchronously the first time that the repository is accessed (e.g., at the first login). If this flag is set to 'false', the query indexes for each workspace in the repository will be rebuilt asynchronously.
Rebuilding the indexes synchronously can cause very significant latency in the initial repository access if the repository contains a significant amount of content that must be reindexed. Updating the indexes asynchronously eliminates this latency, but repository queries may generate inconsistent results while the indexes are being updated. That is, query results may refer to content that is no longer in the repository or may fail to include appropriate results for nodes that had been added to the repository.
The default value is 'true', meaning the rebuilds are performed synchronously.
The enumeration literal is Option.QUERY_INDEXES_REBUILT_SYNCHRONOUSLY.
rebuildQueryIndexOnStartup
An advanced setting that specifies the strategy used to determine which query indexes need to be rebuilt when the repository restarts. ModeShape currently supports two strategies:
  • A value of "always" dictates that the query index for every workspace in the repository will be rebuilt each time that the repository restarts. This can sharply increase the startup time for the repository, particularly if the queryIndexesRebuiltSynchronously option is set to 'true' (the default). However, this strategy ensures that any repository content that was modified outside of the repository (e.g., files in a FileSystemSource that were directly modified on the file system) is properly indexed.
  • A value of "ifMissing" indicates that indexes should only be rebuilt if they do not currently exist or are obviously invalid. This strategy is always the most appropriate strategy for non-clustered repositories with repository sources that provide exclusive control over content (e.g., the InfinispanSource, the JpaSource) as it greatly reduces repository startup time for repositories with significant amounts of content.
Note that repositories that do not configure the queryIndexDirectory option will always use an in-memory index. This type of index will not be persisted across repository restarts and will require ModeShape to rebuild the indexes each time the repository starts up even if the "ifMissing" strategy is specified.
The "always" strategy is used by default and in cases where the option's value does not case-independently match one of these two values. This was the only strategy available prior to ModeShape 2.8.1.GA.
The enumeration literal is Option.REBUILD_QUERY_INDEX_ON_STARTUP, and the values are RebuildQueryIndexOnStartupOption.ALWAYS and RebuildQueryIndexOnStartupOption.IF_MISSING.
projectNodeTypes An advanced boolean flag that defines whether or not the node types should be exposed as content under the "/jcr:system/jcr:nodeTypes" node. Value is either "true" or "false" (default). The enumeration literal is Option.PROJECT_NODE_TYPES.
readDepth An advanced integer flag that specifies the depth of the subgraphs that should be loaded from the connectors during normal read operations. The default value is 1. The enumeration literal is Option.READ_DEPTH.
indexReadDepth An advanced integer flag that specifies the depth of the subgraphs that should be loaded from the connectors during indexing operations. The default value is 4. The enumeration literal is Option.INDEX_READ_DEPTH.
tablesIncludeColumnsForInheritedProperties
An advanced boolean flag that dictates whether the property definitions inherited from supertypes should be represented in the corresponding queryable table with columns. The JCR specification gives implementations some flexibility, so ModeShape allows this to be controlled.
When this option is set to "false", then each table has only those columns representing the (single-valued) property definitions explicitly defined by the node type. When this option is set to "true" (the default), each table will contain columns for each of the (single-valued) property definitions explicitly defined on the node type and inherited by the node type from all of the supertypes.
The enumeration literal is Option.TABLES_INCLUDE_COLUMNS_FOR_INHERITED_PROPERTIES.
performReferentialIntegrityChecks
An advanced boolean flag that specifies whether referential integrity checks should be performed upon Session.save(). If set to "true" (the default), referential integrity checks are performed to ensure that nodes referenced by other nodes cannot be removed. If the value is set to "false", then these referential integrity checks will not be performed when removing nodes.
Many people discourage the use of REFERENCE properties because of the overhead and the need for referential integrity. These concerns are somewhat mitigated by the introduction in JCR 2.0 of the WEAKREFERENCE property type, which is excluded from referential integrity checks.
This option is available for those cases where REFERENCE properties are not used within your content, and thus the referential integrity checks will never find violations. In these cases, you may disable these checks to slightly improve performance of delete operations.
The enumeration literal is Option.PERFORM_REFERENTIAL_INTEGRITY_CHECKS.
versionHistoryStructure
An advanced flag that specifies the structure used to store version histories under the "/jcr:system/jcr:versionStorage" branch. The JCR 2.0 specification does not predefine any particular structure, but ModeShape supports two types:
  • A value of "flat" dictates that all "nt:versionHistory" nodes are stored with a name matching the UUID of the versioned node and directly under the "/jcr:system/jcr:versionStorage" node. For example, given a "mix:versionable" node with the UUID fae2b929-c5ef-4ce5-9fa1-514779ca0ae3, the corresponding "nt:versionHistory" node will be at "/jcr:system/jcr:versionStorage/fae2b929-c5ef-4ce5-9fa1-514779ca0ae3".
  • A value of "hierarchical" dictates that all "nt:versionHistory" nodes are stored under a hierarchical structure created by the first 8 characters of the UUID string. For example, given a "mix:versionable" node with the UUID fae2b929-c5ef-4ce5-9fa1-514779ca0ae3, the corresponding "nt:versionHistory" node will be at "/jcr:system/jcr:versionStorage/fa/e2/b9/29/c5ef-4ce5-9fa1-514779ca0ae3".
The "hierarchical" structure is used by default and in cases where the option's value does not case-independently match one of these two values.
The enumeration literal is Option.VERSION_HISTORY_STRUCTURE, and the values are VersionHistoryOption.FLAT and VersionHistoryOption.HIERARCHICAL.
removeDerivedContentWithOriginal
An advanced boolean flag that dictates whether content derived from other content (e.g., that output by sequencers) should be automatically (re)moved when the content from which it was derived is (re)moved from the repository. For example, consider that a file is uploaded and sequenced, and that the content derived from the file is stored in the repository. When that file is (re)moved, this option dictates whether the derived content should also be (re)moved automatically.
By default this option has a value of "true", ensuring that all derived content is deleted whenever the original content is deleted. A value of "false" will leave the derived content.
The enumeration literal is Option.REMOVE_DERIVED_CONTENT_WITH_ORIGINAL.
useAnonymousAccessOnFailedLogin
A boolean flag that indicates whether any failed, non-anonymous login attempts will automatically cause the Session to be created using the anonymous context. If anonymous logins are not enabled (with the anonymousUserRoles option), then the login will still fail.
By default this option has a value of "false", ensuring that non-anonymous login attempts either succeed as the requested user or fail.
The enumeration literal is Option.USE_ANONYMOUS_ACCESS_ON_FAILED_LOGIN.
useSecurityContextCredentials Older versions of ModeShape allowed client applications to pass in Credentials implementations that had a getSecurityContext() method that returned a SecurityContext object, which ModeShape would then use for authorization. However, since ModeShape now provides support for customized authentication and authorization modules, this is no longer needed and has been deprecated. If, however, your applications were written to use this SecurityContextCredentials implementation, then you can enable this option to turn the old behavior back on. Note, however, that this option will be removed in the next major release. Value is either "true" or "false" (default). The enumeration literal is Option.USE_SECURITY_CONTEXT_CREDENTIALS.

Warning

Setting the useAnonymousAccessOnFailedLogin option to "true" and setting the anonymousUserRoles to a valid value means that all login attempts will succeed, but named login attempts may actually succeed in an anonymous context. You can programmatically determine which context is being used by checking the value of Session.getUserID().
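The two layouts listed under the versionHistoryStructure option in Table 8.1 can be reproduced with a short, self-contained sketch. The VersionHistoryPaths class below is a hypothetical helper written only to mirror the two examples in the table; it is not ModeShape code, and it assumes the UUID is in the standard 36-character form.

```java
import java.util.UUID;

// Hypothetical helper (not part of ModeShape) that mirrors the "flat" and
// "hierarchical" examples given for the versionHistoryStructure option.
final class VersionHistoryPaths {

    static final String VERSION_STORAGE = "/jcr:system/jcr:versionStorage";

    // "flat": the version history sits directly under the storage node.
    static String flat(UUID uuid) {
        return VERSION_STORAGE + "/" + uuid;
    }

    // "hierarchical": the first 8 characters of the UUID become four
    // two-character path segments; the remainder names the final node.
    static String hierarchical(UUID uuid) {
        String text = uuid.toString();
        StringBuilder path = new StringBuilder(VERSION_STORAGE);
        for (int i = 0; i < 8; i += 2) {
            path.append('/').append(text, i, i + 2);
        }
        // Index 8 is the first hyphen, which is dropped.
        path.append('/').append(text.substring(9));
        return path.toString();
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("fae2b929-c5ef-4ce5-9fa1-514779ca0ae3");
        System.out.println(flat(uuid));
        System.out.println(hierarchical(uuid));
    }
}
```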

8.6. Repository System Content

Each JCR repository contains information about the system in the "/jcr:system" area of the repository content. All of this system content applies to the whole repository (e.g., namespaces, node types, locks, versions, etc.) and therefore every session for each workspace sees the exact same "/jcr:system" content.
ModeShape implements this behavior by storing all "/jcr:system" content in a separate workspace, and then using federation to project that content into each workspace. This ensures that all workspaces see the same content, without having to duplicate the "/jcr:system" content in each workspace and ensure those copies stay in sync. Federation is better than duplication.
By default, ModeShape creates this separate system workspace in a transient, in-memory store. This works well for simple cases, but it does not work when using clustering (see Section 8.13, “Clustering with ModeShape”), versioning, or dynamically registering namespaces or adding or changing node types. This is because these features all rely upon changing or adding content in the "/jcr:system" area. For example, version histories are stored under "/jcr:system/jcr:versionStorage", node types under "/jcr:system/jcr:nodeTypes", and namespaces under "/jcr:system/mode:namespaces".
In these situations, it is necessary to persist the system content in a repository source, and if clustering is enabled this source needs to be accessible to all members of the cluster. Many times, the easiest approach is to define an extra workspace in your repository source where the system content can be stored. It's also possible to define a separate repository source with a separate workspace for each repository's system content. (Using a separate source is required when the repository is using a single repository source that can only store limited kinds of nodes, like the file system connector or Subversion connector that can only store nt:file and nt:folder nodes.)
You should always configure each ModeShape repository with a source for its system workspace by using the SYSTEM_SOURCE_NAME repository option with a value that defines the name of the source and the name of the workspace in that source where the system content should be stored, in the format:
  workspaceName@sourceName
This specifies the system content should be stored in the workspace named "workspaceName" in the "sourceName" repository source.
The system content can be stored in any repository source capable of storing any content and, in the case of clustering, that is accessible across multiple processes. For most people, this will mean a relational database.
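To make the "workspaceName@sourceName" format concrete, the following sketch splits such a value according to the rules stated for the systemSourceName option: whitespace around each name is trimmed, a value without '@' names only the source, and a literal '@' inside a name is written with a preceding backslash. The SystemSourceName class is a hypothetical illustration; ModeShape's own parsing may treat edge cases (such as several unescaped '@' characters) differently.

```java
// Hypothetical illustration (not ModeShape code) of the
// "workspaceName@sourceName" format used by the systemSourceName option.
final class SystemSourceName {

    final String workspaceName; // null means "use the source's default workspace"
    final String sourceName;

    private SystemSourceName(String workspaceName, String sourceName) {
        this.workspaceName = workspaceName;
        this.sourceName = sourceName;
    }

    static SystemSourceName parse(String value) {
        // Find the first '@' that is not escaped with a backslash.
        int split = -1;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' && i + 1 < value.length() && value.charAt(i + 1) == '@') {
                i++; // skip the escaped '@'
            } else if (c == '@') {
                split = i;
                break;
            }
        }
        if (split < 0) {
            return new SystemSourceName(null, clean(value));
        }
        return new SystemSourceName(clean(value.substring(0, split)),
                                    clean(value.substring(split + 1)));
    }

    // Replace escaped '@' characters and trim surrounding whitespace.
    private static String clean(String raw) {
        return raw.replace("\\@", "@").trim();
    }

    public static void main(String[] args) {
        SystemSourceName parsed = parse(" system @ SystemStore ");
        System.out.println(parsed.workspaceName + " in " + parsed.sourceName);
    }
}
```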

8.7. Example: Defining a Source for System Content

The following is an abbreviated example of an XML configuration that defines a source for the system content (in a MySQL database) and a repository that uses it:
<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns:mode="http://www.modeshape.org/1.0" 
	             xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <mode:repositories>
    <mode:repository jcr:name="car repository" mode:source="Cars">
      <mode:options jcr:primaryType="mode:options">
        <!-- Explicitly specify the "system" workspace in the "SystemStore" source. -->
        <systemSourceName jcr:primaryType="mode:option" 
	                           mode:value="system@SystemStore"/>
        ...
      </mode:options>
      ...
    </mode:repository>
    ...
  </mode:repositories>
  <mode:sources jcr:primaryType="nt:unstructured">
    <!-- One source for the "/jcr:system" content ... -->
    <mode:source jcr:name="SystemStore" 
                 mode:classname="org.modeshape.connector.store.jpa.JpaSource"
                 mode:description="The database store for our system content"
                 mode:dialect="org.hibernate.dialect.MySQLDialect"
                 mode:dataSourceJndiName="java:/MyDataSource"
                 mode:defaultWorkspaceName="system"
                 mode:autoGenerateSchema="validate"/>    
    <!-- Another source for the regular content ... -->
    <mode:source jcr:name="Cars" 
                 mode:classname="org.modeshape.connector.store.jpa.JpaSource"
                 mode:description="The database store for our regular content"
                 mode:dialect="org.hibernate.dialect.MySQLDialect"
                 mode:dataSourceJndiName="java:/MyDataSource"
                 mode:defaultWorkspaceName="workspace1"
                 mode:autoGenerateSchema="validate">
      <mode:predefinedWorkspaceNames>workspace1</mode:predefinedWorkspaceNames>
      <mode:predefinedWorkspaceNames>workspace2</mode:predefinedWorkspaceNames>
      <mode:predefinedWorkspaceNames>workspace3</mode:predefinedWorkspaceNames>
    </mode:source>
    ...
  </mode:sources>
  ...
</configuration>
Of course, you can always use a separate workspace in your primary source, too:
<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns:mode="http://www.modeshape.org/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <mode:repositories>
    <mode:repository jcr:name="car repository" mode:source="Cars">
      <mode:options jcr:primaryType="mode:options">
        <!-- Explicitly specify the "system" workspace in the "Cars" source. -->
        <systemSourceName jcr:primaryType="mode:option" mode:value="system@Cars"/>
        ...
      </mode:options>
      ...
    </mode:repository>
    ...
  </mode:repositories>
  <mode:sources jcr:primaryType="nt:unstructured">
    <!-- 
    Define one source for the regular content with a special workspace for the system content.
    -->
    <mode:source jcr:name="Cars" 
                 mode:classname="org.modeshape.connector.store.jpa.JpaSource"
                 mode:description="The database store for our system content"
                 mode:dialect="org.hibernate.dialect.MySQLDialect"
                 mode:dataSourceJndiName="java:/MyDataSource"
                 mode:defaultWorkspaceName="workspace1"
                 mode:autoGenerateSchema="validate">
      <mode:predefinedWorkspaceNames>workspace1</mode:predefinedWorkspaceNames>    
      <mode:predefinedWorkspaceNames>workspace2</mode:predefinedWorkspaceNames>    
      <mode:predefinedWorkspaceNames>workspace3</mode:predefinedWorkspaceNames>    
      <mode:predefinedWorkspaceNames>system</mode:predefinedWorkspaceNames>    
    </mode:source>
    ...
  </mode:sources>
  ...
</configuration>

8.8. Query Index Directory

ModeShape maintains a set of index files that are used to process queries and searches, using the Lucene search engine. By default, these indexes are kept in memory (primarily because it is easy to configure). Most production configurations, however, should store these index files on the local file system rather than in memory.
Each ModeShape repository can be configured with the location where its indexes should be stored, using the "QUERY_INDEX_DIRECTORY" repository option (see JcrRepository.Option) when using the programmatic API or the "queryIndexDirectory" repository option in a ModeShape configuration file. The value of this setting should be the absolute or relative path to the folder where the indexes should be stored. In this directory, ModeShape will store the index files for each workspace in a folder named similarly to the workspace. Note that ModeShape will dynamically create these workspace folders as required.
For example, here is part of a ModeShape configuration file that specifies these index files should be stored in the "data/car_repository/indexes" folder, relative to the folder where the JVM process was started:
<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns:mode="http://www.modeshape.org/1.0" 
	             xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <mode:repositories>
    <mode:repository jcr:name="car repository" mode:source="Cars">
      <mode:options jcr:primaryType="mode:options">
        <!-- Explicitly specify the directory where the index files should be stored. -->
        <queryIndexDirectory jcr:primaryType="mode:option" 
	                           mode:value="data/car_repository/indexes"/>
        ...
      </mode:options>
      ...
    </mode:repository>
    ...
  </mode:repositories>
  ...
</configuration>
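The rules stated for this option (a path that is absolute or relative to where the JVM was started, missing directories created on demand, and a failure if the location is not a readable and writable directory) can be illustrated with the standard java.nio.file API. The IndexDirectories class below is a hypothetical sketch, not ModeShape's implementation; in particular, using the workspace name unchanged as the folder name is an assumption, since the text only says the folder is "named similarly to the workspace".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch (not ModeShape code) of the checks described for the
// queryIndexDirectory option.
final class IndexDirectories {

    // Resolve the configured value, creating missing directories if needed.
    static Path resolveIndexDirectory(String configuredValue) {
        // Relative paths are resolved against the JVM's working directory.
        Path dir = Paths.get(configuredValue).toAbsolutePath().normalize();
        try {
            // Fails if the path exists but is not a directory.
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot create index directory " + dir, e);
        }
        if (!Files.isReadable(dir) || !Files.isWritable(dir)) {
            throw new IllegalStateException("Index directory is not readable and writable: " + dir);
        }
        return dir;
    }

    // Assumption: one sub-folder per workspace, named after the workspace.
    static Path workspaceDirectory(Path indexDirectory, String workspaceName) {
        try {
            return Files.createDirectories(indexDirectory.resolve(workspaceName));
        } catch (IOException e) {
            throw new IllegalStateException("Cannot create folder for workspace " + workspaceName, e);
        }
    }

    public static void main(String[] args) {
        // Use the system temporary directory so the demo leaves the working directory alone.
        String base = Paths.get(System.getProperty("java.io.tmpdir"),
                                "car_repository", "indexes").toString();
        Path indexes = resolveIndexDirectory(base);
        System.out.println(workspaceDirectory(indexes, "workspace1"));
    }
}
```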

8.9. Security Modules

ModeShape 2.6 introduced pluggable authentication and authorization modules. Several modules are included and configured out-of-the-box, but it is now possible to implement and configure customized authentication and authorization logic. This section describes how these modules work, what's there out-of-the-box, and how to implement and add your own modules.
The AuthenticationProvider interface defines a single method:
public interface AuthenticationProvider {

  /**
   * Authenticate the user that is using the supplied credentials. If the supplied
   * credentials are authenticated, this method should construct an ExecutionContext 
   * that reflects the authenticated environment, including the context's valid
   * SecurityContext that will be used for authorization throughout the Session.
   * <p>
   * Note that each provider is handed a map into which it can place name-value 
   * pairs that will be used in the Session attributes of the Session that results
   * from this authentication attempt. ModeShape will ignore any attributes if 
   * this provider does not authenticate the credentials.
   * </p>
   * 
   * @param credentials the user's JCR credentials, which may be an 
   *  AnonymousCredentials if authenticating as an anonymous user
   * @param repositoryName the name of the JCR repository; never null
   * @param workspaceName the name of the JCR workspace; never null
   * @param repositoryContext the execution context of the repository, which 
   * may be wrapped by this method
   * @param sessionAttributes the map of name-value pairs that will be placed 
   *  into the Session's attributes; never null
   * @return the execution context for the authenticated user, or null if 
   * this provider could not authenticate the user
   */
  ExecutionContext authenticate( Credentials credentials,
                                 String repositoryName,
                                 String workspaceName,
                                 ExecutionContext repositoryContext,
                                 Map<String,Object> sessionAttributes );

}
When a client calls one of the Repository login methods, ModeShape calls the authenticate method on each of the AuthenticationProvider implementations registered with the Repository. As soon as one provider returns a non-null ExecutionContext, the caller is authenticated and ModeShape uses that ExecutionContext within the resulting Session.
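The lookup described above amounts to a "first non-null result wins" loop. The following sketch illustrates only that control flow: it uses simplified stand-in types (a Provider interface, and plain strings in place of Credentials and ExecutionContext) rather than the ModeShape interfaces, and the scratch-map handling of session attributes is an assumption about how attributes from failed attempts might be discarded.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration; these are NOT the ModeShape types.
class ProviderChainSketch {

    // Stand-in for AuthenticationProvider: returns a context name, or null on failure.
    interface Provider {
        String authenticate(String credentials, Map<String, Object> sessionAttributes);
    }

    // Ask each provider in order; the first non-null result authenticates the caller.
    static String login(List<Provider> providers, String credentials,
                        Map<String, Object> sessionAttributes) {
        for (Provider provider : providers) {
            // Each provider gets a scratch map so attributes from failed attempts are dropped.
            Map<String, Object> scratch = new HashMap<>();
            String context = provider.authenticate(credentials, scratch);
            if (context != null) {
                sessionAttributes.putAll(scratch);
                return context;
            }
        }
        return null; // no provider accepted the credentials
    }

    public static void main(String[] args) {
        Provider rejectsAll = (creds, attrs) -> {
            attrs.put("tried", "rejectsAll");
            return null;
        };
        Provider acceptsAlice = (creds, attrs) -> {
            if (!"alice".equals(creds)) {
                return null;
            }
            attrs.put("authenticatedBy", "acceptsAlice");
            return "context-for-alice";
        };
        Map<String, Object> attributes = new HashMap<>();
        String context = login(Arrays.asList(rejectsAll, acceptsAlice), "alice", attributes);
        System.out.println(context + " " + attributes);
    }
}
```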
When the client uses the Session and attempts to perform actions on the content, ModeShape uses the ExecutionContext's SecurityContext to determine whether the user has the necessary privileges. If the SecurityContext object implements the AuthorizationProvider interface, then ModeShape will call the hasPermission(...) method, passing in the ExecutionContext, the repository name, the name of the source used for the repository, the workspace name, the path of the node upon which the actions are being applied, and the array of actions (see ModeShapePermissions for the possible values):
public interface AuthorizationProvider {

  /**
   * Determine if the supplied execution context has permission for all of the 
   * named actions in the named workspace. If not all actions are allowed, the
   * method returns false.
   * 
   * @param context the context in which the subject is performing the 
   *        actions on the supplied workspace
   * @param repositoryName the name of the repository containing the 
   *        workspace content
   * @param repositorySourceName the name of the repository's source
   * @param workspaceName the name of the workspace in which the path exists
   * @param path the path on which the actions are occurring
   * @param actions the list of ModeShapePermissions actions to check
   * @return true if the subject has privilege to perform all of the named 
   *         actions on the content at the supplied path in the
   *         given workspace within the repository, or false otherwise
   */
  boolean hasPermission( ExecutionContext context,
                         String repositoryName,
                         String repositorySourceName,
                         String workspaceName,
                         Path path,
                         String... actions );
}
If the SecurityContext does not implement AuthorizationProvider, then ModeShape uses role-based authorization by mapping the actions into roles and then calling the SecurityContext.hasRole(...) method for each role (see Section 3.5, “Security”). The operation is allowed to continue only if all of these invocations return true.
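The role-based fallback can be pictured as an "all required roles must be held" check. The sketch below is an illustration under stated assumptions, not ModeShape's implementation: RoleSource is a stand-in for the hasRole(...) method of SecurityContext, the action-to-role table is hypothetical (the real mapping is defined by ModeShape), and denying unknown actions is a choice made for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified stand-ins for illustration; these are NOT the ModeShape types.
class RoleFallbackSketch {

    // Stand-in for the hasRole(...) method of SecurityContext.
    interface RoleSource {
        boolean hasRole(String roleName);
    }

    // Hypothetical action-to-role table; the real mapping is defined by ModeShape.
    static final Map<String, String> ROLE_FOR_ACTION = new HashMap<>();
    static {
        ROLE_FOR_ACTION.put("read", "readonly");
        ROLE_FOR_ACTION.put("set_property", "readwrite");
        ROLE_FOR_ACTION.put("remove", "readwrite");
    }

    // Allowed only if every requested action maps to a role the caller holds.
    static boolean isAllowed(RoleSource context, String... actions) {
        for (String action : actions) {
            String role = ROLE_FOR_ACTION.get(action);
            if (role == null || !context.hasRole(role)) {
                return false; // unknown action or missing role: deny (an assumption)
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> roles = new HashSet<>(Arrays.asList("readonly"));
        RoleSource reader = roles::contains;
        System.out.println("read allowed:          " + isAllowed(reader, "read"));
        System.out.println("read + remove allowed: " + isAllowed(reader, "read", "remove"));
    }
}
```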

8.10. Available Security Providers

ModeShape comes with several AuthenticationProvider implementations that are automatically configured with every Repository, depending upon other settings and options. These providers are as follows:
  • JaasProvider uses JAAS (see Section 3.6, “JAAS and ModeShape”) for all authentication and role-based authorization. This provider authenticates clients that login to the Repository with a SimpleCredentials object whose username and password match those in the JAAS policy, or with a JaasCredentials object constructed with a specific, already-authenticated JAAS LoginContext. This provider can be disabled by setting the jaasLoginConfigName configuration option (see Section 8.5, “JCR Repository Options”) to an empty (i.e., zero-length) value; otherwise, the option defines the name of the JAAS login configuration and defaults to "modeshape-jcr" if not explicitly set. (This provider also works in some J2EE containers in which the JAAS Subject is not available via the standard JAAS API and instead requires use of the JACC API, which many J2EE containers support.)
  • SeamSecurityProvider delegates all authentication and role-based authorization to the Seam Security framework. This provider authenticates clients that login to the Repository with no need to pass a Credentials object. Note this does require obtaining a session for each servlet request, which is actually how the JCR API was intended to be used within web applications. This provider is automatically enabled when the Seam Security Identity class is found on the classpath.
  • ServletProvider delegates all authentication and role-based authorization to the servlet framework. This provider authenticates clients that login to the Repository with a ServletCredentials object, which can be constructed with the HttpServletRequest. Note this does require obtaining a session for each servlet request, which is actually how the JCR API was intended to be used within web applications. This provider is automatically enabled when the HttpServletRequest class is found on the classpath.
  • AnonymousProvider will allow clients without Credentials to operate upon the repository, and will use role-based authorization based upon the roles defined by the anonymousUserRoles configuration option (see Section 8.5, “JCR Repository Options”). This provider authenticates clients that provide an AnonymousCredentials to the Repository's login(...) methods or use one of the login(...) methods that does not take a Credentials object.

Note

The SecurityContextProvider is also configured, but only when the useSecurityContextCredentials configuration option (see Section 8.5, “JCR Repository Options”) is set to 'true'. This provider authenticates clients that pass a SecurityContextCredentials object, and delegates all authentication to the embedded SecurityContext. This deprecated approach is not enabled by default, and will be removed in the next major release of ModeShape. It remains in place so that applications using this approach can upgrade to ModeShape 2.6 (or later) without breaking their authentication mechanism.

8.11. Custom Providers

It is possible to provide your own authentication and authorization logic by providing one or more classes that implement the AuthenticationProvider interface, specifying the names of these classes in the configuration (see below), and making the classes available on the correct classpath.
Implementing the AuthenticationProvider interface is pretty straightforward. Your class needs a no-arg constructor, and the authenticate method must authenticate the credentials for the named repository and workspace. If the credentials are not authenticated, return null. Otherwise, create an ExecutionContext instance (from the ExecutionContext supplied in the repositoryContext parameter) that contains an appropriate SecurityContext instance for the authenticated user. As mentioned above, the SecurityContext should also implement the AuthorizationProvider interface if you need authorization that is not role-based.

8.12. Example: Implement a Custom Provider

For example, let's imagine that our JCR application has its own authentication and authorization system. We can integrate with that by creating a new Credentials implementation called MyAppCredentials to encapsulate any information needed by the authentication/authorization system, which we'll assume is accessed by a singleton class SecurityService. We can then implement AuthenticationProvider as follows:
public class MyAppAuthorizationProvider implements AuthenticationProvider {

  private String appName;
  
  /**
   * Any public JavaBean properties can be set in the configuration
   */
  public void setApplicationName( String appName ) {
    this.appName = appName;
  }

  /**
   * Authenticate the user that is using the supplied credentials. If the supplied
   * credentials are authenticated, this method should construct an ExecutionContext
   * that reflects the authenticated environment, including the context's valid
   * SecurityContext that will be used for authorization throughout the Session.
   * <p>
   * Note that each provider is handed a map into which it can place name-value 
   * pairs that will be used in the Session attributes of the Session that results
   * from this authentication attempt. ModeShape will ignore any attributes if 
   * this provider does not authenticate the credentials.
   * </p>
   * 
   * @param credentials the user's JCR credentials, which may be an 
   *  AnonymousCredentials if authenticating as an anonymous user
   * @param repositoryName the name of the JCR repository; never null
   * @param workspaceName the name of the JCR workspace; never null
   * @param repositoryContext the execution context of the repository, which 
   * may be wrapped by this method
   * @param sessionAttributes the map of name-value pairs that will be placed 
   *  into the Session's attributes; never null
   * @return the execution context for the authenticated user, or null if 
   * this provider could not authenticate the user
   */
  public ExecutionContext authenticate( Credentials credentials,
                                        String repositoryName,
                                        String workspaceName,
                                        ExecutionContext repositoryContext,
                                        Map<String,Object> sessionAttributes ) {
    if ( credentials instanceof MyAppCredentials ) {
      // Try to authenticate ...
      MyAppCredentials appCreds = (MyAppCredentials)credentials;
      String user = appCreds.getUser();
      Object token = appCreds.getToken();
      AppCreds creds = SecurityService.login(appName,user,token);
      if ( creds != null ) {
        // We're in ...
        SecurityContext securityContext = new MyAppSecurityContext(creds);
        return repositoryContext.with(securityContext);
      }
    }
    return null;    
  }
}
where the MyAppSecurityContext is as follows:
public class MyAppSecurityContext 
            implements SecurityContext, AuthorizationProvider {
  private final AppCreds creds;
  public MyAppSecurityContext( AppCreds creds ) {
    this.creds = creds;
  }
 	
  /**
   * {@inheritDoc SecurityContext#getUserName()}
   * 
   * @see SecurityContext#getUserName()
   */
  public final String getUserName() {
      return creds.getUser();
  }

  /**
   * {@inheritDoc SecurityContext#hasRole(String)}
   * 
   * @see SecurityContext#hasRole(String)
   */
  public final boolean hasRole( String roleName ) {
      // shouldn't be called since we've implemented AuthorizationProvider
      return false;
  }

  /**
   * {@inheritDoc}
   * 
   * @see org.modeshape.graph.SecurityContext#logout()
   */
  public void logout() {
      creds.logout();
  }
  
  /**
   * {@inheritDoc}
   * 
   * @see org.modeshape.jcr.security.AuthorizationProvider#hasPermission
   */
  public boolean hasPermission( ExecutionContext context,
                           String repositoryName,
                           String repositorySourceName,
                           String workspaceName,
                           Path path,
                           String... actions ) {
    // This is imaginary and simplistic, but you'd implement any authorization logic here ...
    return this.creds.isAuthorized(repositoryName,workspaceName,path);
  }
}
Then we just need to configure the Repository to use this provider. In the ModeShape configuration files, there is an optional "mode:authenticationProviders" child element of "mode:repository", and within this fragment you can define zero or more authentication providers by specifying a name, the class, an optional description, and optionally any bean properties that should be called upon instantiation. (Note that the class will be instantiated only once per Repository instance). Here's an example configuration file:
<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns:mode="http://www.modeshape.org/1.0" 
	             xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <mode:repositories>
    <mode:repository jcr:name="MyApp Repository" mode:source="Store">
      ...
      <mode:authenticationProviders>
        <!-- Specify the providers in a manner similar to the way 
             sequencers are defined -->
        <mode:authenticationProvider jcr:name="CustomProviderA" 
                    mode:classname="org.example.MyAppAuthorizationProvider">
          <mode:description>My authentication provider</mode:description>
          <!-- Set JavaBean properties on provider if needed -->
          <mode:applicationName>MyAppName</mode:applicationName>
        </mode:authenticationProvider>
        ...
      </mode:authenticationProviders>
      ...
    </mode:repository>
    ...
  </mode:repositories>
  ...
</configuration>

8.13. Clustering with ModeShape

ModeShape 2.1 introduced the ability to have a cluster of JcrEngine instances distributed across multiple processes while behaving as though everything was happening in a single process. With clusters, the workload can be distributed across multiple machines, increasing tolerance against failure while allowing ModeShape to scale out to handle more workload.
ModeShape clustering uses the powerful, flexible and mature JGroups library to handle all network communication within the cluster. JGroups provides a wealth of capabilities, including automatically detecting new engines in the cluster (called discovery), reliable multicast communication, and automatic determination of the master node in the cluster. JGroups has a flexible protocol stack, works across firewalls, WANs and LANs, and supports multiple transport protocols, failure detection, reliable unicast and multicast message transmission, and encryption.
By default, clustering is not enabled. This means that each JcrEngine instance is self-contained and will not be aware of changes made in other JcrEngine instances. This is perfect in many lightweight or embedded scenarios, because it does not introduce any overhead associated with network communication.
However, clustering ModeShape is very easy and requires only a few simple steps:
  1. Enable clustering in the ModeShape configuration (more on this in a bit).
  2. Include the JAR file for the modeshape-clustering module in your application.
  3. Start (or deploy) multiple JcrEngine instances using the same configuration. For embedded scenarios, this means instantiating multiple JcrEngine instances in multiple processes. In other cases, this means deploying ModeShape to multiple servers (using the WebDAV server, using the REST server, or deploying into JNDI for use by your own applications).
Your JCR-based application does not need to change in any other way. Any EventListener implementations registered in Sessions on any of the engines will be notified of all events, regardless of whether those events were due to changes in the local or remote engines.
It also does not matter how many Repository instances are defined in the configuration and managed by each JcrEngine instance: each engine in the cluster can manage multiple named repositories. ModeShape ensures that all Sessions for a named repository see the changes made to that repository, regardless of where those sessions are located in the cluster. Likewise, those same changes will not be visible to the sessions for any other named repository.

8.14. Enabling Clustering in ModeShape

A ModeShape configuration can have a "clustering" fragment that defines the name of the cluster and the JGroups configuration:
<mode:clustering clusterName="modeshape-cluster" configuration="jgroups-modeshape.xml" />
The "clusterName" is a string that is a logical name of the cluster; all engines connecting to the same name form a cluster. Any messages multicast from one engine in the cluster will be received by all other members of the cluster. Again, the cluster name is independent of the repositories managed by the engines.
The "configuration" value is a string that is one of:
  • the absolute file system path to the file containing the JGroups XML configuration;
  • the relative file system path to the file containing the JGroups XML configuration, relative to the current working directory of the Java process;
  • the name of a resource on the classpath containing the JGroups XML configuration;
  • the URL that can be resolved to the JGroups XML configuration; or
  • the string representation of JGroups configuration, either in XML format or the older string format.
The format of this JGroups configuration will be described in Section 8.15, “JGroups Configuration”. If the "configuration" property is not given, ModeShape will use the default JGroups configuration (as defined by the specific JGroups version).

Note

Note that all engines in the cluster must have the same JGroups configuration. In fact, all engines in the cluster will almost always have exactly the same ModeShape configuration.
Here is an example of a "clustering" fragment defining a cluster named "modeshape-cluster" using the JGroups configuration defined in the "jgroups-modeshape.xml" file at the supplied URL:
<clustering clusterName="modeshape-cluster" 
	  configuration="file:///some/path/jgroups-modeshape.xml" />
This next example uses the JGroups configuration defined in the "jgroups-modeshape.xml" resource file on the classpath (or as an absolute path on a *nix system):
<clustering clusterName="modeshape-cluster" 
	  configuration="/some/path/jgroups-modeshape.xml" />
Next is an example that specifies the JGroups configuration using the older string representation of the form:
<clustering clusterName="modeshape-cluster" 
	  configuration="PROTOCOL(param=value;param=value):PROTOCOL:PROTOCOL" />
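To show how the older string form is structured, here is a minimal sketch of a parser for it, written against the JDK only. It is not JGroups code: the class and method names are hypothetical, and it assumes only the shape shown above (protocols separated by ':', parameters enclosed in parentheses and separated by ';'). Colons inside parentheses are treated as part of a parameter value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for illustration; JGroups ships its own configurator.
class ProtocolStringSketch {

    // One entry in the stack: a protocol name plus its parameters in declaration order.
    static final class Protocol {
        final String name;
        final Map<String, String> params = new LinkedHashMap<>();
        Protocol(String name) { this.name = name; }
    }

    // Split on ':' outside parentheses; the first protocol is the bottom of the stack.
    static List<Protocol> parse(String spec) {
        List<Protocol> stack = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i <= spec.length(); i++) {
            char c = i < spec.length() ? spec.charAt(i) : ':'; // sentinel ends the last token
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ':' && depth == 0) {
                String token = spec.substring(start, i).trim();
                if (!token.isEmpty()) {
                    stack.add(parseProtocol(token));
                }
                start = i + 1;
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unbalanced parentheses in: " + spec);
        }
        return stack;
    }

    private static Protocol parseProtocol(String token) {
        int open = token.indexOf('(');
        if (open < 0) {
            return new Protocol(token); // protocol without parameters
        }
        int close = token.lastIndexOf(')');
        if (close < open) {
            throw new IllegalArgumentException("Malformed protocol: " + token);
        }
        Protocol protocol = new Protocol(token.substring(0, open).trim());
        for (String pair : token.substring(open + 1, close).split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                protocol.params.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return protocol;
    }

    public static void main(String[] args) {
        String spec = "UDP(max_bundle_size=60000;max_bundle_timeout=30):PING(timeout=2000):FD_SOCK";
        for (Protocol protocol : parse(spec)) {
            System.out.println(protocol.name + " " + protocol.params);
        }
    }
}
```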
Of course, the "configuration" property can be specified as a child element, too (line breaks added for readability):
<clustering clusterName="modeshape-cluster">
	     <configuration>UDP(max_bundle_size=60000;max_bundle_timeout=30):
		                  PING(timeout=2000):...</configuration>
</clustering>
And finally an example that specifies the JGroups configuration using the newer XML representation (line breaks added for readability):
<clustering clusterName="modeshape-cluster">
	 <configuration><![CDATA[<config><UDP max_bundle_size="60000" 
	      max_bundle_timeout="30".../><PING timeout="2000"/>...</config>]]>
	 </configuration>
</clustering>
Note that this example uses a child XML element for the "configuration", along with a CDATA section, so that the XML configuration can be nested within the ModeShape configuration.

Warning

Remember to specify the system workspace name (see Section 8.6, “Repository System Content”) for each repository that is clustered.

8.15. JGroups Configuration

The JGroups configuration defines a protocol stack that is used for messaging, starting with the bottom-most protocol and ending with the top-most protocol.
An example of the recommended JGroups XML format follows:
<config>
   <UDP
        mcast_addr="${jgroups.udp.mcast_addr:228.10.10.10}"
        mcast_port="${jgroups.udp.mcast_port:45588}"
        discard_incompatible_packets="true"
        max_bundle_size="60000"
        max_bundle_timeout="30"
        ip_ttl="${jgroups.udp.ip_ttl:2}"
        enable_bundling="true"
        thread_pool.enabled="true"
        thread_pool.min_threads="1"
        thread_pool.max_threads="25"
        thread_pool.keep_alive_time="5000"
        thread_pool.queue_enabled="false"
        thread_pool.queue_max_size="100"
        thread_pool.rejection_policy="Run"
        oob_thread_pool.enabled="true"
        oob_thread_pool.min_threads="1"
        oob_thread_pool.max_threads="8"
        oob_thread_pool.keep_alive_time="5000"
        oob_thread_pool.queue_enabled="false"
        oob_thread_pool.queue_max_size="100"
        oob_thread_pool.rejection_policy="Run"/>
   <PING timeout="2000"
           num_initial_members="3"/>
   <MERGE2 max_interval="30000"
           min_interval="10000"/>
   <FD_SOCK/>
   <FD timeout="10000" max_tries="5" />
   <VERIFY_SUSPECT timeout="1500"  />
   <BARRIER />
   <pbcast.NAKACK
                  use_mcast_xmit="false" gc_lag="0"
                  retransmit_timeout="300,600,1200,2400,4800"
                  discard_delivered_msgs="true"/>
   <UNICAST timeout="300,600,1200,2400,3600"/>
   <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                  max_bytes="400000"/>
   <VIEW_SYNC avg_send_interval="60000"   />
   <pbcast.GMS print_local_addr="true" join_timeout="3000"
               view_bundling="true"/>
   <FC max_credits="20000000"
                   min_threshold="0.10"/>
   <FRAG2 frag_size="60000"  />
   <pbcast.STATE_TRANSFER  />
</config>
For more details on how to configure the JGroups stack, see the JGroups Manual.

Note

JGroups is also used in Infinispan, JBoss EAP, and other open source projects, and many of the JGroups configurations will work with ModeShape deployed in those same environments. For example, this blog post describes how to configure JGroups with three autodiscovery options available on Amazon EC2.

8.16. Using ModeShape in Web Applications

Your web application or JBoss service can use one of the JCR Repository instances running inside the ModeShape service with a URL such as:
 jndi:jcr/local?repositoryName=repository
Be sure to use the correct repository name.
Since the JCR API JAR is on the global classpath, your web application can use the JCR API without having to include the JAR file in your application's WAR file. In fact, your application will likely get ClassCastExceptions if it does include the JCR API in its WAR file. Plus, if needed, your application can use ModeShape's "org.modeshape.jcr.api" extensions to the JCR API (again, on the global classpath), and should not need or use any of the classes or interfaces in the ModeShape implementation.

8.17. Configuring a Predefined Node Hierarchy

The SOA_ROOT/jboss-as/server/PROFILE/deploy/modeshape-services.jar/modeshape-initial-content.xml file is an optional XML file which can be added to the main ModeShape configuration file:
<mode:initialContent mode:workspaces="default" mode:applyToNewWorkspaces="true" mode:content="modeshape-initial-content.xml"/>
Its purpose is to allow users to configure, at repository startup, a predefined node hierarchy with which the repository will be pre-populated. In other words, once the repository has started up, the node hierarchy from the XML file will be already present in the repository. The name of the XML element will be the name of the node, while the XML structure itself (the nested elements) will define the hierarchy.
To define a specific JCR type for a node (or for that matter any other valid JCR property), one needs to define the JCR namespace:
<files xmlns:jcr="http://www.jcp.org/jcr/1.0" jcr:primaryType="nt:folder" jcr:mixinTypes="mode:publishArea">
This shows the definition of a "files" node, of type "nt:folder" and which has the mixin "mode:publishArea".
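As a rough illustration of how such a file maps onto a node hierarchy, the following JDK-only sketch walks an XML document and prints the node path each element would correspond to, together with its jcr:primaryType if one is given. This is not ModeShape's importer: the class name is hypothetical, the child element names in the sample are invented for the example, and treating the document element itself as the top-level node is an assumption that may not match the actual file layout.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration of "element name = node name, nesting = hierarchy".
class InitialContentSketch {

    // Return one entry per element: its path, plus the primary type when declared.
    static List<String> nodePaths(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<String> paths = new ArrayList<>();
            collect(root, "", paths);
            return paths;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse initial content XML", e);
        }
    }

    private static void collect(Element element, String parentPath, List<String> paths) {
        String path = parentPath + "/" + element.getTagName();
        // getAttribute returns an empty string when the attribute is absent.
        String type = element.getAttribute("jcr:primaryType");
        paths.add(type.isEmpty() ? path : path + " [" + type + "]");
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                collect((Element) child, path, paths);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<files xmlns:jcr=\"http://www.jcp.org/jcr/1.0\" jcr:primaryType=\"nt:folder\">"
                + "<reports jcr:primaryType=\"nt:folder\"><archive/></reports>"
                + "</files>";
        for (String path : nodePaths(xml)) {
            System.out.println(path);
        }
    }
}
```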