Chapter 23. Remote Querying

23.1. Remote Querying

Red Hat JBoss Data Grid’s Hot Rod protocol allows remote, language-neutral querying, using either the Infinispan Query Domain-specific Language (DSL) or the Ickle query language. Both methods can be implemented in every language for which a Hot Rod client is currently available.

The Infinispan Query Domain-specific Language

JBoss Data Grid uses its own query language based on an internal DSL. The Infinispan Query DSL provides a simplified way of writing queries, and is agnostic of the underlying query mechanisms. Additional information on the Infinispan Query DSL is available at The Infinispan Query DSL.

Ickle

Ickle is a string-based query language that supports full-text and relational searches. Additional information on Ickle is available at Constructing Ickle Queries.

Protobuf Encoding

Google’s Protocol Buffers is used as the encoding format for both storing and querying data. The Infinispan Query DSL can be used remotely through a Hot Rod client that is configured to use the Protobuf marshaller. Protocol Buffers provide a common format for storing cache entries and marshalling them. Remote clients that need to index and query their stored entities must use the Protobuf encoding format. If indexing is not required, you can still store Protobuf entities without indexing enabled to gain platform independence.

23.2. Querying Comparison

In Library mode, Lucene Query-based, Infinispan Query DSL, and Ickle querying are available. In Remote Client-Server mode, Lucene Query-based querying is not available; remote queries use the Infinispan Query DSL or Ickle. The following table compares the features of each combination.

Table 23.1. Embedded querying and Remote querying

| Feature | Library Mode/Lucene Query | Library Mode/DSL Query | Remote Client-Server Mode/DSL Query | Library Mode/Ickle Query | Remote Client-Server Mode/Ickle Query |
|---|---|---|---|---|---|
| Indexing | Mandatory | Optional but highly recommended | Optional but highly recommended | Optional but highly recommended | Optional but highly recommended |
| Index contents | Selected fields | Selected fields | Selected fields | Selected fields | Selected fields |
| Data Storage Format | Java objects | Java objects | Protocol buffers | Java objects | Protocol buffers |
| Keyword Queries | Yes | No | No | Yes | Yes |
| Range Queries | Yes | Yes | Yes | Yes | Yes |
| Fuzzy Queries | Yes | No | No | Yes | Yes |
| Wildcard | Yes | Limited to like queries (Matches a wildcard pattern that follows JPA rules). | Limited to like queries (Matches a wildcard pattern that follows JPA rules). | Yes | Yes |
| Phrase Queries | Yes | No | No | Yes | Yes |
| Combining Queries | AND, OR, NOT, SHOULD | AND, OR, NOT | AND, OR, NOT | AND, OR, NOT | AND, OR, NOT |
| Sorting Results | Yes | Yes | Yes | Yes | Yes |
| Filtering Results | Yes, both within the query and as appended operator | Within the query | Within the query | Within the query | Within the query |
| Pagination of Results | Yes | Yes | Yes | Yes | Yes |
| Continuous Queries | No | Yes | Yes | No | No |
| Query Aggregation Operations | No | Yes | Yes | Yes | Yes |
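The Wildcard row notes that DSL like queries follow JPA pattern rules, where % matches any sequence of characters and _ matches exactly one character. The following plain-Java sketch models that matching locally so that the semantics are easy to check. The JpaLike class and its methods are hypothetical names, escape characters are not handled, and the code does not call any JBoss Data Grid API.

```java
import java.util.regex.Pattern;

final class JpaLike {

    /** Translates a JPA-style LIKE pattern into an equivalent regular expression. */
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");          // any sequence, including an empty one
            } else if (c == '_') {
                regex.append('.');           // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Returns true if the whole value matches the LIKE pattern. */
    static boolean matches(String value, String likePattern) {
        return toRegex(likePattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("John", "J%"));   // prints true
        System.out.println(matches("Alfred", "J%")); // prints false
        System.out.println(matches("Jack", "J_ck")); // prints true
    }
}
```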

23.3. Performing Remote Queries via the Hot Rod Java Client

Remote querying over Hot Rod can be enabled once the RemoteCacheManager has been configured with the Protobuf marshaller.

The following procedure describes how to enable remote querying over the caches of a RemoteCacheManager.

Prerequisites

RemoteCacheManager must be configured to use the Protobuf Marshaller.

Enabling Remote Querying via Hot Rod

  1. Add the infinispan-remote.jar

    The infinispan-remote.jar is an uberjar, and therefore no other dependencies are required for this feature.

  2. Enable indexing on the cache configuration

    Indexing is not mandatory for Remote Queries, but it is highly recommended because it makes searches on caches that contain large amounts of data significantly faster. Indexing can be configured at any time. Enabling and configuring indexing is the same as for Library mode.

    Add the following configuration within the cache-container element located inside the Infinispan subsystem element.

    <!-- A basic example of an indexed local cache
        that uses the RAM Lucene directory provider -->
    <local-cache name="an-indexed-cache">
        <!-- Enable indexing using the RAM Lucene directory provider -->
        <indexing index="ALL">
            <property name="default.directory_provider">ram</property>
        </indexing>
    </local-cache>
  3. Register the Protobuf schema definition files

    Register the Protobuf schema definition files by adding them to the ___protobuf_metadata system cache. The cache key is a string that denotes the file name, and the value is the contents of the .proto file, as a string. Alternatively, Protobuf schemas can be registered by invoking the registerProtofile methods of the server’s ProtobufMetadataManager MBean. There is one instance of this MBean per cache container, and it is backed by the ___protobuf_metadata cache, so the two approaches are equivalent.

    For an example of providing the protobuf schema via ___protobuf_metadata system cache, see Registering a Protocol Buffers schema file.

    Note

    Writing to the ___protobuf_metadata cache requires the ___schema_manager role be added to the user performing the write.

    The following example demonstrates how to invoke the registerProtofile methods of the ProtobufMetadataManager MBean.

    Registering Protobuf schema definition files via JMX

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;
    
    ...
    
    String serverHost = ...         // The address of your JDG server
    int serverJmxPort = ...         // The JMX port of your server
    String cacheContainerName = ... // The name of your cache container
    String schemaFileName = ...     // The name of the schema file
    String schemaFileContents = ... // The Protobuf schema file contents
    
    JMXConnector jmxConnector = JMXConnectorFactory.connect(new JMXServiceURL(
        "service:jmx:remoting-jmx://" + serverHost + ":" + serverJmxPort));
    MBeanServerConnection jmxConnection = jmxConnector.getMBeanServerConnection();
    
    ObjectName protobufMetadataManagerObjName =
        new ObjectName("jboss.infinispan:type=RemoteQuery,name=" +
        ObjectName.quote(cacheContainerName) +
        ",component=ProtobufMetadataManager");
    
    jmxConnection.invoke(protobufMetadataManagerObjName,
                         "registerProtofile",
                         new Object[]{schemaFileName, schemaFileContents},
                         new String[]{String.class.getName(), String.class.getName()});
    jmxConnector.close();

Result

All data placed in the cache is immediately searchable, whether or not indexing is in use. Entries do not need to be annotated, unlike embedded queries. The entity classes are only meaningful to the Java client and do not exist on the server.

Once remote querying has been enabled, the QueryFactory can be obtained using the following:

Obtaining the QueryFactory

import org.infinispan.client.hotrod.Search;
import org.infinispan.query.dsl.QueryFactory;
import org.infinispan.query.dsl.Query;
import org.infinispan.query.dsl.SortOrder;
...
remoteCache.put(2, new User("John", 33));
remoteCache.put(3, new User("Alfred", 40));
remoteCache.put(4, new User("Jack", 56));
remoteCache.put(5, new User("Jerry", 20)); // distinct key, so the "Jack" entry is not overwritten

QueryFactory qf = Search.getQueryFactory(remoteCache);
Query query = qf.from(User.class)
    .orderBy("age", SortOrder.ASC)
    .having("name").like("J%")
    .and().having("age").gte(33)
    .build();

List<User> list = query.list();
assertEquals(2, list.size());
assertEquals("John", list.get(0).getName());
assertEquals(33, list.get(0).getAge());
assertEquals("Jack", list.get(1).getName());
assertEquals(56, list.get(1).getAge());

Queries can now be run over Hot Rod in a similar way to Library mode.
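To see which entries the example query is expected to return, the same predicate can be expressed over an in-memory list with Java streams. This is only a plain-Java model of the query semantics (name like 'J%', age greater than or equal to 33, ordered by age ascending). It does not use Hot Rod, and the QuerySemanticsSketch class and its nested User type are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class QuerySemanticsSketch {

    /** Minimal stand-in for the User entity used in the example above. */
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /** Local equivalent of: name like 'J%' and age >= 33, ordered by age ascending. */
    static List<String> namesMatching(List<User> users) {
        return users.stream()
                .filter(u -> u.name.startsWith("J"))
                .filter(u -> u.age >= 33)
                .sorted(Comparator.comparingInt((User u) -> u.age))
                .map(u -> u.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("John", 33), new User("Alfred", 40),
                new User("Jack", 56), new User("Jerry", 20));
        System.out.println(namesMatching(users)); // prints [John, Jack]
    }
}
```

Alfred is excluded by the name pattern and Jerry by the age bound, which leaves John and Jack in ascending age order.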

23.4. Remote Querying in the Hot Rod C++ Client

For instructions on using remote querying in the Hot Rod C++ Client refer to Performing Remote Queries in the Hot Rod C++ Client.

23.5. Remote Querying in the Hot Rod C# Client

For instructions on using remote querying in the Hot Rod C# Client refer to Performing Remote Queries in the Hot Rod C# Client.

23.6. Protobuf Encoding

23.6.1. Protobuf Encoding

The Infinispan Query DSL can be used remotely via the Hot Rod client. In order to do this, protocol buffers are used to adopt a common format for storing cache entries and marshalling them.

For more information, see https://developers.google.com/protocol-buffers/docs/overview

23.6.2. Storing Protobuf Encoded Entities

Protobuf requires data to be structured. This is achieved by declaring Protocol Buffer message types in .proto files.

For example:

library.proto

package book_sample;
message Book {
    required string title = 1;
    required string description = 2;
    required int32 publicationYear = 3; // no native Date type available in Protobuf

    repeated Author authors = 4;
}
message Author {
    required string name = 1;
    required string surname = 2;
}

In the provided example:

  1. An entity named Book is placed in a package named book_sample.

    package book_sample;
    message Book {
  2. The entity declares several fields of primitive types and a repeatable field named authors.

        required string title = 1;
        required string description = 2;
        required int32 publicationYear = 3; // no native Date type available in Protobuf
    
        repeated Author authors = 4;
    }
  3. The Author message instances are embedded in the Book message instance.

    message Author {
        required string name = 1;
        required string surname = 2;
    }
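To make the encoding concrete, the following sketch hand-encodes an Author message according to the Protocol Buffers wire format: each field is written as a key, computed as (field number << 3) | wire type, followed by its payload, and string fields use the length-delimited wire type 2. The AuthorWireSketch class is a hypothetical illustration that uses only the JDK. In practice the ProtoStream library performs this encoding, so the code is not needed in an application.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class AuthorWireSketch {

    /** Writes a non-negative int as a base-128 varint, least significant group first. */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // set the continuation bit
            value >>>= 7;
        }
        out.write(value);
    }

    /** Writes a string field: key, UTF-8 length, then the UTF-8 bytes. */
    static void writeStringField(ByteArrayOutputStream out, int fieldNumber, String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (fieldNumber << 3) | 2); // wire type 2 = length-delimited
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Encodes an Author message: name is field 1, surname is field 2. */
    static byte[] encodeAuthor(String name, String surname) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeStringField(out, 1, name);
        writeStringField(out, 2, surname);
        return out.toByteArray();
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 0a 03 41 6e 6e 12 03 4c 65 65
        System.out.println(toHex(encodeAuthor("Ann", "Lee")));
    }
}
```

Because the field numbers, and not the field names, appear in the encoded bytes, any client that knows the same .proto definition can decode the entry regardless of its implementation language.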

23.6.3. About Protobuf Messages

There are a few important things to note about Protobuf messages:

  • Nesting of messages is possible; however, the resulting structure is strictly a tree, and never a graph.
  • There is no type inheritance.
  • Collections are not supported; however, arrays can be easily emulated using repeated fields.

23.6.4. Using Protobuf with Hot Rod

Protobuf can be used with JBoss Data Grid’s Hot Rod using the following two steps:

  1. Configure the client to use a dedicated marshaller, in this case, the ProtoStreamMarshaller. This marshaller uses the ProtoStream library to assist in encoding objects.

    Important

    If the infinispan-remote jar is not in use, then the infinispan-remote-query-client Maven dependency must be added to use the ProtoStreamMarshaller.

  2. Instruct the ProtoStream library on how to marshall message types by registering per entity marshallers.

Use the ProtoStreamMarshaller to Encode and Marshall Messages

import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.infinispan.client.hotrod.marshall.ProtoStreamMarshaller;
import org.infinispan.protostream.FileDescriptorSource;
import org.infinispan.protostream.SerializationContext;
...
ConfigurationBuilder clientBuilder = new ConfigurationBuilder();
clientBuilder.addServer()
    .host("127.0.0.1").port(11234)
    .marshaller(new ProtoStreamMarshaller());

RemoteCacheManager remoteCacheManager = new RemoteCacheManager(clientBuilder.build());
SerializationContext serCtx =
    ProtoStreamMarshaller.getSerializationContext(remoteCacheManager);
serCtx.registerProtoFiles(FileDescriptorSource.fromResources("/library.proto"));
serCtx.registerMarshaller(new BookMarshaller());
serCtx.registerMarshaller(new AuthorMarshaller());
// Book and Author classes omitted for brevity

In the provided example,

  • The SerializationContext is provided by the ProtoStream library.
  • The SerializationContext.registerProtoFiles method receives a FileDescriptorSource that identifies the .proto classpath resource file containing the message type definitions.
  • The SerializationContext associated with the RemoteCacheManager is obtained, then ProtoStream is instructed to marshall the protobuf types.
Note

A RemoteCacheManager has no SerializationContext associated with it unless it was configured to use ProtoStreamMarshaller.

23.6.5. Registering Per Entity Marshallers

When using the ProtoStreamMarshaller for remote querying purposes, you must register a per entity marshaller for each domain model type, or marshalling fails. Marshallers must be stateless and thread-safe, because a single instance of each marshaller is shared across all calls.

The following example shows how to write a marshaller.

BookMarshaller.java

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.infinispan.protostream.MessageMarshaller;
...
public class BookMarshaller implements MessageMarshaller<Book> {
    @Override
    public String getTypeName() {
        return "book_sample.Book";
    }
    @Override
    public Class<? extends Book> getJavaClass() {
        return Book.class;
    }
    @Override
    public void writeTo(ProtoStreamWriter writer, Book book) throws IOException {
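        writer.writeString("title", book.getTitle());
        writer.writeString("description", book.getDescription());
        // publicationYear is a required field and is read back in readFrom, so it must be written
        writer.writeInt("publicationYear", book.getPublicationYear());
        writer.writeCollection("authors", book.getAuthors(), Author.class);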
    }
    @Override
    public Book readFrom(ProtoStreamReader reader) throws IOException {
        String title = reader.readString("title");
        String description = reader.readString("description");
        int publicationYear = reader.readInt("publicationYear");
        Set<Author> authors = reader.readCollection("authors",
            new HashSet<Author>(), Author.class);
        return new Book(title, description, publicationYear, authors);
    }
}

Once the client has been set up, reading and writing Java objects to the remote cache will use the entity marshallers. The actual data stored in the cache will be protobuf encoded, provided that marshallers were registered with the remote client for all involved types. In the provided example, this would be Book and Author.

Objects stored in Protobuf format can be used by compatible clients written in different languages.

23.6.6. Indexing Protobuf Encoded Entities

You can configure indexing for caches on the JBoss Data Grid server after you configure the client to use Protobuf.

To index entries in a cache, JBoss Data Grid must have access to the message types defined in a Protobuf schema, which is a file with a .proto extension.

You provide JBoss Data Grid with a Protobuf schema by placing it in the ___protobuf_metadata cache with a put, putAll, putIfAbsent, or replace operation. Alternatively you can invoke the ProtobufMetadataManager MBean via JMX.

Both keys and values of the ___protobuf_metadata cache are Strings. The key is the file name and the value is the contents of the schema file.

Note

Users that perform write operations to the ___protobuf_metadata cache require the ___schema_manager role.

Registering a Protocol Buffers schema file

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.query.remote.client.ProtobufMetadataManagerConstants;

RemoteCacheManager remoteCacheManager = ... // obtain a RemoteCacheManager

// obtain the '___protobuf_metadata' cache
RemoteCache<String, String> metadataCache =
    remoteCacheManager.getCache(
        ProtobufMetadataManagerConstants.PROTOBUF_METADATA_CACHE_NAME);

String schemaFileContents = ... // this is the contents of the schema file
metadataCache.put("my_protobuf_schema.proto", schemaFileContents);
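The schemaFileContents value above must hold the full text of the .proto file. One way to obtain the key and the value, assuming the schema is available as a local file, is sketched below using only the JDK. SchemaEntry and fromFile are hypothetical names; the resulting key and value would then be passed to metadataCache.put as shown above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.AbstractMap;
import java.util.Map;

final class SchemaEntry {

    /**
     * Builds the (file name, file contents) pair expected by the metadata cache.
     * Assumes the schema file is UTF-8 encoded.
     */
    static Map.Entry<String, String> fromFile(Path protoFile) {
        try {
            String contents = new String(Files.readAllBytes(protoFile), StandardCharsets.UTF_8);
            String fileName = protoFile.getFileName().toString();
            return new AbstractMap.SimpleImmutableEntry<>(fileName, contents);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read schema file " + protoFile, e);
        }
    }
}
```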

The ProtobufMetadataManager is a cluster-wide replicated repository of Protobuf schema definitions, or .proto files. For each running cache manager, a separate ProtobufMetadataManager MBean instance exists, and is backed by the ___protobuf_metadata cache. The ProtobufMetadataManager ObjectName uses the following pattern:

<jmx domain>:type=RemoteQuery,
    name=<cache manager name>,
    component=ProtobufMetadataManager
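The pattern can be assembled with the JDK's javax.management.ObjectName class, in the same way as the JMX example earlier in this chapter. The following sketch assumes the jboss.infinispan JMX domain used in that example and a hypothetical cache container named clustered. ObjectName.quote is applied to the container name so that special characters in it do not break the name. MetadataManagerName is an illustrative helper, not part of any JBoss Data Grid API.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class MetadataManagerName {

    /** Builds the ProtobufMetadataManager ObjectName for the given domain and container. */
    static ObjectName forContainer(String jmxDomain, String cacheContainerName) {
        try {
            return new ObjectName(jmxDomain + ":type=RemoteQuery,name="
                    + ObjectName.quote(cacheContainerName)
                    + ",component=ProtobufMetadataManager");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid JMX domain: " + jmxDomain, e);
        }
    }

    public static void main(String[] args) {
        // "clustered" is a placeholder; use the name of your own cache container
        ObjectName name = forContainer("jboss.infinispan", "clustered");
        System.out.println(name.getKeyProperty("component"));
    }
}
```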

The following signature is used by the method that registers the Protobuf schema file:

void registerProtofile(String name, String contents)

If indexing is enabled for a cache, all fields of Protobuf-encoded entries are indexed. All Protobuf-encoded entries are searchable, regardless of whether indexing is enabled.

Note

Indexing is recommended for improved performance but is not mandatory when using remote queries. Using indexing improves the searching speed but can also reduce the insert/update speeds due to the overhead required to maintain the index.
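The trade-off described in this note can be modelled in a few lines of plain Java. In the sketch below, a HashMap stands in for the cache and a TreeMap stands in for an index on a single numeric field: both query methods return the same entries, but the indexed lookup avoids visiting every entry, while each put performs extra work to keep the index current. IndexVsScan is a hypothetical model and does not reflect how JBoss Data Grid or Lucene store index data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class IndexVsScan {

    private final Map<String, Integer> ageByKey = new HashMap<>();                // the "cache"
    private final NavigableMap<Integer, Set<String>> keysByAge = new TreeMap<>(); // the "index"

    /** Stores an entry and updates the index, which is the extra write cost. */
    void put(String key, int age) {
        Integer previous = ageByKey.put(key, age);
        if (previous != null) {
            Set<String> stale = keysByAge.get(previous);
            stale.remove(key);
            if (stale.isEmpty()) {
                keysByAge.remove(previous);
            }
        }
        keysByAge.computeIfAbsent(age, a -> new TreeSet<>()).add(key);
    }

    /** Non-indexed query: inspects every entry in the cache. */
    Set<String> scanAtLeast(int minAge) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : ageByKey.entrySet()) {
            if (entry.getValue() >= minAge) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** Indexed query: visits only the index range that can match. */
    Set<String> indexedAtLeast(int minAge) {
        Set<String> result = new TreeSet<>();
        for (Set<String> keys : keysByAge.tailMap(minAge, true).values()) {
            result.addAll(keys);
        }
        return result;
    }
}
```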

23.6.7. Controlling Field Indexing

After you enable indexing for a cache, all Protobuf type fields are indexed and stored by default. However, this indexing can degrade performance and result in inefficient querying for Protobuf message types that contain many fields or very large fields.

You can control which fields are indexed by adding the @Indexed and @Field annotations directly to the Protobuf schema. Place each annotation in a documentation comment, on the last line of the comment that precedes the message type or field to annotate.

@Indexed
  • Applies to message types only.
  • Has a boolean value. The default value is true, so specifying @Indexed has the same result as @Indexed(true).
  • Lets you specify which fields of the message type are indexed. If you specify @Indexed(false), no fields are indexed and all @Field annotations are ignored.
@Field
  • Applies to fields only.
  • Has three attributes: index, store, and analyze. Each attribute can have a value of NO or YES.

    • index specifies if the field is indexed, which includes the field in indexed queries.
    • store specifies if the field is stored in the index, which allows the field to be used for projections.
    • analyze specifies if the field is included in full text searches.
  • Defaults to @Field(index=Index.YES, store=Store.NO, analyze=Analyze.NO).
  • Replaces the @IndexedField annotation.

    As of this release, @IndexedField is deprecated. If you include this annotation, JBoss Data Grid logs a warning message. You can replace @IndexedField annotations with @Field annotations as follows:

    • @IndexedField is equivalent to @Field(store=Store.YES)
    • @IndexedField(store=false) is equivalent to @Field
    • @IndexedField(index=false, store=false) is equivalent to @Field(index=Index.NO)
Important

If you specify the @Indexed and @Field annotations, you must include annotations for the message type and each field. Otherwise the entire message is not indexed.
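The three replacement rules for @IndexedField follow from the defaults: @IndexedField indexes and stores a field unless told otherwise, while @Field indexes a field but does not store it. The following sketch captures that mapping as a small helper that produces the equivalent @Field text. IndexedFieldMigration is a hypothetical name, and the fourth combination (index=false, store=true) is inferred from the stated defaults rather than taken from the listed rules.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexedFieldMigration {

    /**
     * Returns the @Field annotation equivalent to @IndexedField(index=..., store=...).
     * Only attributes that differ from the @Field defaults are emitted.
     */
    static String toFieldAnnotation(boolean index, boolean store) {
        List<String> attributes = new ArrayList<>();
        if (!index) {
            attributes.add("index=Index.NO");   // @Field defaults to Index.YES
        }
        if (store) {
            attributes.add("store=Store.YES");  // @Field defaults to Store.NO
        }
        return attributes.isEmpty()
                ? "@Field"
                : "@Field(" + String.join(", ", attributes) + ")";
    }

    public static void main(String[] args) {
        System.out.println(toFieldAnnotation(true, true));   // prints @Field(store=Store.YES)
        System.out.println(toFieldAnnotation(true, false));  // prints @Field
        System.out.println(toFieldAnnotation(false, false)); // prints @Field(index=Index.NO)
    }
}
```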

23.6.7.1. Example of an Annotated Message Type

The following is an example of a message type that contains the @Indexed and @Field annotations:

/*
  This type is indexed but not all fields are indexed.
  @Indexed
*/
message Note {

  /*
    This field is indexed but not stored.
    @Field
  */
  optional string text = 1;

  /*
    This field is indexed and stored.
    @Field(store=Store.YES)
  */
  optional string author = 2;

  /*
    This field is stored but not indexed.
    @Field(index=Index.NO, store=Store.YES)
  */
  optional bool isRead = 3;

  /*
    This field is not indexed or stored.
    @Field(index=Index.NO)
  */
  optional int32 priority = 4;
}

23.6.7.2. Disabling Indexing for All Protobuf Message Types

You can disable indexing for all Protobuf message types that are not annotated. Set the value of the indexed_by_default Protobuf schema option to false at the start of each schema file, as follows:

option indexed_by_default = false;  //Disable indexing of all types that are not annotated for indexing.

23.6.8. Defining Protocol Buffers Schemas With Java Annotations

You can declare Protobuf metadata using Java annotations. Instead of providing a MessageMarshaller implementation and a .proto schema file, you can add minimal annotations to a Java class and its fields.

The objective of this method is to marshal Java objects to Protobuf using the ProtoStream library. The ProtoStream library generates the marshaller internally, so you do not need to implement one manually. The Java annotations require minimal information, such as the Protobuf tag number. The rest is inferred from common-sense defaults (Protobuf type, Java collection type, and collection element type), which you can override.

The auto-generated schema is registered with the SerializationContext and is also available to the users to be used as a reference to implement domain model classes and marshallers for other languages.

The following are examples of Java annotations:

User.java

package sample;

import org.infinispan.protostream.annotations.ProtoEnum;
import org.infinispan.protostream.annotations.ProtoEnumValue;
import org.infinispan.protostream.annotations.ProtoField;
import org.infinispan.protostream.annotations.ProtoMessage;

@ProtoMessage(name = "ApplicationUser")
public class User {

    @ProtoEnum(name = "Gender")
    public enum Gender {
        @ProtoEnumValue(number = 1, name = "M")
        MALE,

        @ProtoEnumValue(number = 2, name = "F")
        FEMALE
    }

    @ProtoField(number = 1, required = true)
    public String name;

    @ProtoField(number = 2)
    public Gender gender;
}

Note.java

package sample;

import org.infinispan.protostream.annotations.ProtoDoc;
import org.infinispan.protostream.annotations.ProtoField;

@ProtoDoc("@Indexed")
public class Note {

    private String text;

    private User author;

    @ProtoDoc("@Field")
    @ProtoField(number = 1)
    public String getText() {
        return text;
    }

    public void setText(String text) {
        this.text = text;
    }

    @ProtoDoc("@Field(store = Store.YES)")
    @ProtoField(number = 2)
    public User getAuthor() {
        return author;
    }

    public void setAuthor(User author) {
        this.author = author;
    }
}

ProtoSchemaBuilderDemo.java

import org.infinispan.protostream.SerializationContext;
import org.infinispan.protostream.annotations.ProtoSchemaBuilder;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.marshall.ProtoStreamMarshaller;

...

RemoteCacheManager remoteCacheManager = ... // we have a RemoteCacheManager
SerializationContext serCtx =
    ProtoStreamMarshaller.getSerializationContext(remoteCacheManager);

// generate and register a Protobuf schema and marshallers based
// on Note class and the referenced classes (User class)
ProtoSchemaBuilder protoSchemaBuilder = new ProtoSchemaBuilder();
String generatedSchema = protoSchemaBuilder
    .fileName("sample_schema.proto")
    .packageName("sample_package")
    .addClass(Note.class)
    .build(serCtx);

// the types can be marshalled now
assertTrue(serCtx.canMarshall(User.class));
assertTrue(serCtx.canMarshall(Note.class));
assertTrue(serCtx.canMarshall(User.Gender.class));

// display the schema file
System.out.println(generatedSchema);

The following is the .proto file that is generated by the ProtoSchemaBuilderDemo.java example.

sample_schema.proto

package sample_package;

 /* @Indexed */
message Note {

   /* @Field */
   optional string text = 1;

   /* @Field(store = Store.YES) */
   optional ApplicationUser author = 2;
}

message ApplicationUser {

   enum Gender {
      M = 1;
      F = 2;
   }

   required string name = 1;
   optional Gender gender = 2;
}
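The idea of inferring a schema from annotated fields can be illustrated with a toy generator that uses only JDK reflection. The sketch below defines its own hypothetical @Tag annotation in place of @ProtoField, maps a handful of Java types to Protobuf types, skips unannotated fields, and emits a message definition ordered by tag number. It is a simplified model under those assumptions and does not reproduce the actual rules or output format of ProtoSchemaBuilder.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

final class SchemaInferenceSketch {

    /** Hypothetical stand-in for @ProtoField: carries the tag number and required flag. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Tag {
        int number();
        boolean required() default false;
    }

    /** Toy domain class; the unannotated field is treated as transient. */
    static class Person {
        @Tag(number = 1, required = true) String name;
        @Tag(number = 2) int age;
        String notMarshalled;
    }

    /** Type mapping assumed by this sketch only. */
    static String protoTypeOf(Class<?> javaType) {
        if (javaType == String.class) return "string";
        if (javaType == int.class || javaType == Integer.class) return "int32";
        if (javaType == long.class || javaType == Long.class) return "int64";
        if (javaType == boolean.class || javaType == Boolean.class) return "bool";
        throw new IllegalArgumentException("Unsupported type in this sketch: " + javaType);
    }

    /** Generates a message definition from the annotated fields of the given class. */
    static String generateMessage(Class<?> type) {
        Map<Integer, String> linesByTag = new TreeMap<>(); // sorted by tag number
        for (Field field : type.getDeclaredFields()) {
            Tag tag = field.getAnnotation(Tag.class);
            if (tag == null) {
                continue; // no annotation: the field is ignored for marshalling
            }
            String label = tag.required() ? "required" : "optional";
            linesByTag.put(tag.number(), "   " + label + " " + protoTypeOf(field.getType())
                    + " " + field.getName() + " = " + tag.number() + ";");
        }
        StringBuilder schema = new StringBuilder("message " + type.getSimpleName() + " {\n");
        for (String line : linesByTag.values()) {
            schema.append(line).append('\n');
        }
        return schema.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(generateMessage(Person.class));
    }
}
```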

The following table lists the supported Java annotations, along with what each applies to and its parameters.

Table 23.2. Java Annotations

| Annotation | Applies To | Purpose | Requirement | Parameters |
|---|---|---|---|---|
| @ProtoDoc | Class/Field/Enum/Enum member | Specifies the documentation comment that will be attached to the generated Protobuf schema element (message type, field definition, enum type, enum value definition) | Optional | A single String parameter, the documentation text |
| @ProtoMessage | Class | Specifies the name of the generated message type. If missing, the class name is used instead | Optional | name (String), the name of the generated message type; if missing, the Java class name is used by default |
| @ProtoField | Field, Getter or Setter | Specifies the Protobuf field number and its Protobuf type. Also indicates if the field is repeated, optional or required and its (optional) default value. If the Java field type is an interface or an abstract class, its actual type must be indicated. If the field is repeatable and the declared collection type is abstract then the actual collection implementation type must be specified. If this annotation is missing, the field is ignored for marshalling (it is transient). A class must have at least one @ProtoField annotated field to be considered Protobuf marshallable. | Required | number (int, mandatory), the Protobuf number; type (org.infinispan.protostream.descriptors.Type, optional), the Protobuf type, it can usually be inferred; required (boolean, optional); name (String, optional), the Protobuf name; javaType (Class, optional), the actual type, only needed if declared type is abstract; collectionImplementation (Class, optional), the actual collection type, only needed if declared type is abstract; defaultValue (String, optional), the string must have the proper format according to the Java field type |
| @ProtoEnum | Enum | Specifies the name of the generated enum type. If missing, the Java enum name is used instead | Optional | name (String), the name of the generated enum type; if missing, the Java enum name is used by default |
| @ProtoEnumValue | Enum member | Specifies the numeric value of the corresponding Protobuf enum value | Required | number (int, mandatory), the Protobuf number; name (String), the Protobuf name; if missing, the name of the Java member is used |

Note

The @ProtoDoc annotation can be used to provide documentation comments in the generated schema, and also to inject the @Indexed and @Field annotations where needed. See Custom Fields Indexing with Protobuf for additional information.