Chapter 3. Querying remote caches

You can index and query remote caches on Data Grid Server.

3.1. Querying caches from Hot Rod Java clients

Data Grid lets you programmatically query remote caches from Java clients through the Hot Rod endpoint. This procedure explains how to index and query a remote cache that stores Book instances.

Prerequisites

  • Add the ProtoStream processor to your pom.xml.

Data Grid provides this processor for the @ProtoField and @ProtoDoc annotations so you can generate Protobuf schemas and perform queries.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.infinispan</groupId>
      <artifactId>infinispan-bom</artifactId>
      <version>${version.infinispan}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>org.infinispan.protostream</groupId>
    <artifactId>protostream-processor</artifactId>
    <scope>provided</scope>
  </dependency>
</dependencies>

Procedure

  1. Add indexing annotations to your class, as in the following example:

    Book.java

    import org.infinispan.protostream.annotations.ProtoDoc;
    import org.infinispan.protostream.annotations.ProtoFactory;
    import org.infinispan.protostream.annotations.ProtoField;
    
    @ProtoDoc("@Indexed")
    public class Book {
    
       @ProtoDoc("@Field(index=Index.YES, analyze = Analyze.YES, store = Store.NO)")
       @ProtoField(number = 1)
       final String title;
    
       @ProtoDoc("@Field(index=Index.YES, analyze = Analyze.YES, store = Store.NO)")
       @ProtoField(number = 2)
       final String description;
    
       @ProtoDoc("@Field(index=Index.YES, analyze = Analyze.YES, store = Store.NO)")
       @ProtoField(number = 3, defaultValue = "0")
       final int publicationYear;
    
    
       @ProtoFactory
       Book(String title, String description, int publicationYear) {
          this.title = title;
          this.description = description;
          this.publicationYear = publicationYear;
       }
       // public Getter methods omitted for brevity
    }

  2. Implement the SerializationContextInitializer interface in a new class and then add the @AutoProtoSchemaBuilder annotation.

    1. Reference the class that includes the @ProtoField and @ProtoDoc annotations with the includeClasses parameter.
    2. Define the name and filesystem path of the generated Protobuf schema with the schemaFileName and schemaFilePath parameters.
    3. Specify the package name for the Protobuf schema with the schemaPackageName parameter.

      RemoteQueryInitializer.java

      import org.infinispan.protostream.SerializationContextInitializer;
      import org.infinispan.protostream.annotations.AutoProtoSchemaBuilder;
      
      @AutoProtoSchemaBuilder(
            includeClasses = {
                  Book.class
            },
            schemaFileName = "book.proto",
            schemaFilePath = "proto/",
            schemaPackageName = "book_sample")
      public interface RemoteQueryInitializer extends SerializationContextInitializer {
      }

  3. Compile your project.

    The code examples in this procedure generate a proto/book.proto schema and a RemoteQueryInitializerImpl.java implementation for the annotated Book class.

Next steps

Create a remote cache that configures Data Grid to index your entities. For example, the following remote cache indexes the Book entity in the book.proto schema that you generated in the previous step:

<replicated-cache name="books">
  <indexing>
    <indexed-entities>
      <indexed-entity>book_sample.Book</indexed-entity>
    </indexed-entities>
  </indexing>
</replicated-cache>

The following RemoteQuery class performs these tasks:

  • Registers the RemoteQueryInitializerImpl serialization context with a Hot Rod Java client.
  • Registers the Protobuf schema, book.proto, with Data Grid Server.
  • Adds two Book instances to the remote cache.
  • Performs a full-text query that matches books by keywords in the title.

RemoteQuery.java

package org.infinispan;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.Search;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.infinispan.query.dsl.Query;
import org.infinispan.query.dsl.QueryFactory;
import org.infinispan.query.remote.client.ProtobufMetadataManagerConstants;

public class RemoteQuery {

   public static void main(String[] args) throws Exception {
      ConfigurationBuilder clientBuilder = new ConfigurationBuilder();
      // RemoteQueryInitializerImpl is generated
      clientBuilder.addServer().host("127.0.0.1").port(11222)
            .security().authentication().username("user").password("user")
            .addContextInitializers(new RemoteQueryInitializerImpl());

      RemoteCacheManager remoteCacheManager = new RemoteCacheManager(clientBuilder.build());

      // Read the generated Protobuf schema and register it with the server.
      Path proto = Paths.get(RemoteQuery.class.getClassLoader()
            .getResource("proto/book.proto").toURI());
      String protoBufCacheName = ProtobufMetadataManagerConstants.PROTOBUF_METADATA_CACHE_NAME;
      remoteCacheManager.getCache(protoBufCacheName).put("book.proto", Files.readString(proto));

      // Obtain the 'books' remote cache
      RemoteCache<Object, Object> remoteCache = remoteCacheManager.getCache("books");

      // Add some Books
      Book book1 = new Book("Infinispan in Action", "Learn Infinispan by using it", 2015);
      Book book2 = new Book("Cloud-Native Applications with Java and Quarkus", "Build robust and reliable cloud applications", 2019);

      remoteCache.put(1, book1);
      remoteCache.put(2, book2);

      // Execute a full-text query
      QueryFactory queryFactory = Search.getQueryFactory(remoteCache);
      Query<Book> query = queryFactory.create("FROM book_sample.Book WHERE title:'java'");

      // The result contains the book whose title matches 'java'
      List<Book> list = query.execute().list();

      // Stop the client to release its resources
      remoteCacheManager.stop();
   }
}
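
The RemoteQuery example resolves proto/book.proto with Paths.get(...toURI()), which works when the schema is a plain file on disk but can fail with a FileSystemNotFoundException when the resource is packaged inside a JAR. The following is a minimal sketch of an alternative that reads the schema through an input stream instead. SchemaReader and its methods are hypothetical helper names, not part of the Data Grid or Hot Rod client API, and the sketch uses only the JDK (Java 9 or later).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Data Grid or Hot Rod client API.
final class SchemaReader {

    private SchemaReader() {
    }

    // Reads an entire stream as UTF-8 text.
    static String readAll(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads a classpath resource, such as "proto/book.proto", as a String.
    // Works for resources on the filesystem and for resources inside a JAR.
    static String readResource(ClassLoader loader, String name) {
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found on classpath: " + name);
            }
            return readAll(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the generated schema so the sketch runs without a project build.
        byte[] bytes = "package book_sample;".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(bytes)));
    }
}
```

Under these assumptions, the value passed to put("book.proto", ...) in RemoteQuery could be SchemaReader.readResource(RemoteQuery.class.getClassLoader(), "proto/book.proto").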

3.2. Querying caches from Data Grid Console and CLI

Data Grid Console and the Data Grid Command Line Interface (CLI) let you query indexed and non-indexed remote caches. You can also use any HTTP client to index and query caches via the REST API.

This procedure explains how to index and query a remote cache that stores Person instances.

Prerequisites

  • Have at least one running Data Grid Server instance.
  • Have Data Grid credentials with create permissions.

Procedure

  1. Add indexing annotations to your Protobuf schema, as in the following example:

    package org.infinispan.example;
    
    /* @Indexed */
    message Person {
        /* @Field(index=Index.YES, store = Store.NO, analyze = Analyze.NO) */
        optional int32 id = 1;
    
        /* @Field(index=Index.YES, store = Store.YES, analyze = Analyze.NO) */
        required string name = 2;
    
        /* @Field(index=Index.YES, store = Store.YES, analyze = Analyze.NO) */
        required string surname = 3;
    
        /* @Field(index=Index.YES, store = Store.YES, analyze = Analyze.NO) */
        optional int32 age = 6;
    
    }

    From the Data Grid CLI, use the schema command with the --upload= argument as follows:

    schema --upload=person.proto person.proto

  2. Create a cache named people that uses ProtoStream encoding and configures Data Grid to index entities declared in your Protobuf schema.

    The following cache indexes the Person entity from the previous step:

    <distributed-cache name="people">
      <encoding media-type="application/x-protostream"/>
      <indexing>
        <indexed-entities>
          <indexed-entity>org.infinispan.example.Person</indexed-entity>
        </indexed-entities>
      </indexing>
    </distributed-cache>

    From the CLI, use the create cache command with the --file= argument as follows:

    create cache --file=people.xml people

  3. Add entries to the cache.

    To query a remote cache, it needs to contain some data. For this example procedure, create entries that use the following JSON values:

    PersonOne

    {
      "_type":"org.infinispan.example.Person",
      "id":1,
      "name":"Person",
      "surname":"One",
      "age":44
    }

    PersonTwo

    {
      "_type":"org.infinispan.example.Person",
      "id":2,
      "name":"Person",
      "surname":"Two",
      "age":27
    }

    PersonThree

    {
      "_type":"org.infinispan.example.Person",
      "id":3,
      "name":"Person",
      "surname":"Three",
      "age":35
    }

    From the CLI, use the put command with the --file= argument to add each entry, as follows:

    put --encoding=application/json --file=personone.json personone

    Tip

    From Data Grid Console, you must select Custom Type for the Value content type field when you add values in JSON format with custom types.

  4. Query your remote cache.

    From the CLI, use the query command from the context of the remote cache.

    query "from org.infinispan.example.Person p WHERE p.name='Person' ORDER BY p.age ASC"

    The query returns all entries with a name that matches Person, sorted by age in ascending order.
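
To make the expected result order concrete, the following sketch mirrors the filter and sort of the query in plain Java over the three sample entries. It is an illustration only: it does not connect to Data Grid, and the PersonQuerySketch class and its Person type are hypothetical stand-ins for the Protobuf message.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; runs locally and does not contact Data Grid Server.
final class PersonQuerySketch {

    // Plain Java stand-in for the Person message in the Protobuf schema.
    static final class Person {
        final int id;
        final String name;
        final String surname;
        final int age;

        Person(int id, String name, String surname, int age) {
            this.id = id;
            this.name = name;
            this.surname = surname;
            this.age = age;
        }
    }

    // The three JSON entries from the procedure.
    static List<Person> sampleEntries() {
        return List.of(
                new Person(1, "Person", "One", 44),
                new Person(2, "Person", "Two", 27),
                new Person(3, "Person", "Three", 35));
    }

    // Mirrors: from org.infinispan.example.Person p WHERE p.name='Person' ORDER BY p.age ASC
    static List<Person> query(List<Person> entries, String name) {
        return entries.stream()
                .filter(p -> p.name.equals(name))                      // WHERE p.name = :name
                .sorted(Comparator.comparingInt((Person p) -> p.age))  // ORDER BY p.age ASC
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Person p : query(sampleEntries(), "Person")) {
            System.out.println(p.surname + " (" + p.age + ")");
        }
        // Prints:
        // Two (27)
        // Three (35)
        // One (44)
    }
}
```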

3.3. Using analyzers with remote caches

Analyzers convert input data into terms that you can index and query. You specify analyzer definitions with the @Field annotation in your Java classes or directly in your Protobuf schema.

Procedure

  1. Include the Analyze.YES attribute to indicate that the property is analyzed.
  2. Specify the analyzer definition with the @Analyzer annotation.

Protobuf schema

/* @Indexed */
message TestEntity {

    /* @Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "keyword")) */
    optional string id = 1;

    /* @Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "simple")) */
    optional string name = 2;
}

Java classes

@ProtoDoc("@Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = \"keyword\"))")
@ProtoField(1)
final String id;

@ProtoDoc("@Field(store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = \"simple\"))")
@ProtoField(2)
final String description;

3.3.1. Default analyzer definitions

Data Grid provides a set of default analyzer definitions.

Definition    Description

standard      Splits text fields into tokens, treating whitespace and punctuation as delimiters.
simple        Tokenizes input streams by delimiting at non-letters and then converting all letters to lowercase characters. Whitespace and non-letters are discarded.
whitespace    Splits text streams on whitespace and returns sequences of non-whitespace characters as tokens.
keyword       Treats entire text fields as single tokens.
stemmer       Stems English words using the Snowball Porter filter.
ngram         Generates n-gram tokens that are 3 grams in size by default.
filename      Splits text fields into larger tokens than the standard analyzer, treating whitespace as a delimiter and converting all letters to lowercase characters.

These analyzer definitions are based on Apache Lucene and are provided "as-is". For more information about tokenizers, filters, and CharFilters, see the appropriate Lucene documentation.
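
The differences between some of these definitions are easier to see on a sample string. The following sketch approximates the keyword, whitespace, simple, and ngram behavior described above with JDK string operations only. It does not use Lucene, so edge cases can differ from the real analyzers, and the AnalyzerSketch class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical approximation of analyzer behavior; not the Lucene implementation.
final class AnalyzerSketch {

    // keyword: the entire input is a single token.
    static List<String> keyword(String text) {
        return List.of(text);
    }

    // whitespace: split on whitespace and keep the remaining characters unchanged.
    static List<String> whitespace(String text) {
        return Arrays.stream(text.split("\\s+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // simple: split at non-letters, then convert letters to lowercase.
    static List<String> simple(String text) {
        return Arrays.stream(text.split("[^\\p{L}]+"))
                .filter(t -> !t.isEmpty())
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // ngram: overlapping character sequences of a fixed size taken from one token.
    static List<String> ngrams(String token, int size) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + size <= token.length(); i++) {
            grams.add(token.substring(i, i + size));
        }
        return grams;
    }

    public static void main(String[] args) {
        String title = "Cloud-Native Applications with Java";
        System.out.println(keyword(title));    // [Cloud-Native Applications with Java]
        System.out.println(whitespace(title)); // [Cloud-Native, Applications, with, Java]
        System.out.println(simple(title));     // [cloud, native, applications, with, java]
        System.out.println(ngrams("java", 3)); // [jav, ava]
    }
}
```

With an analyzer that lowercases tokens, a query term such as java can match a title that contains Java, whereas a keyword field matches only the complete, unmodified value.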

3.3.2. Creating custom analyzer definitions

Create custom analyzer definitions and add them to your Data Grid Server installations.

Prerequisites

  • Stop Data Grid Server if it is running.

    Data Grid Server loads classes at startup only.

Procedure

  1. Implement the ProgrammaticSearchMappingProvider API.
  2. Package your implementation in a JAR with the fully qualified class name (FQN) in the following file:

    META-INF/services/org.infinispan.query.spi.ProgrammaticSearchMappingProvider

  3. Copy your JAR file to the server/lib directory of your Data Grid Server installation.
  4. Start Data Grid Server.

ProgrammaticSearchMappingProvider example

import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.core.StopFilterFactory;
import org.apache.lucene.analysis.standard.StandardFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.cfg.SearchMapping;
import org.infinispan.Cache;
import org.infinispan.query.spi.ProgrammaticSearchMappingProvider;

public final class MyAnalyzerProvider implements ProgrammaticSearchMappingProvider {

   @Override
   public void defineMappings(Cache cache, SearchMapping searchMapping) {
      searchMapping
            .analyzerDef("standard-with-stop", StandardTokenizerFactory.class)
               .filter(StandardFilterFactory.class)
               .filter(LowerCaseFilterFactory.class)
               .filter(StopFilterFactory.class);
   }
}
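
The standard-with-stop definition above chains a tokenizer with lowercase and stop-word filters. The following JDK-only sketch shows roughly what such a chain does to input text. It is an approximation under stated assumptions: the tokenizer is simplified to splitting at characters that are neither letters nor digits, StandardFilterFactory is not modeled, and the stop-word set is a small illustrative subset rather than the list that Lucene uses. The StandardWithStopSketch class is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical approximation of a tokenizer plus filter chain; not the Lucene implementation.
final class StandardWithStopSketch {

    // Illustrative subset of English stop words (assumption, not Lucene's list).
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "the", "of", "with", "in", "to");

    static List<String> analyze(String text) {
        return Arrays.stream(text.split("[^\\p{L}\\p{N}]+"))  // simplified tokenizer
                .filter(t -> !t.isEmpty())
                .map(t -> t.toLowerCase(Locale.ROOT))          // lowercase filter
                .filter(t -> !STOP_WORDS.contains(t))          // stop-word filter
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(analyze("Learn Infinispan with the CLI")); // [learn, infinispan, cli]
    }
}
```

The filter order matters in this sketch: lowercasing runs before stop-word removal, so capitalized stop words such as The are also dropped.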