17.5. Protobuf Encoding

The Infinispan Query DSL can be used remotely via the Hot Rod client. In order to do this, protocol buffers are used to adopt a common format for storing cache entries and marshalling them.

17.5.1. Storing Protobuf Encoded Entities

Protobuf requires data to be structured. This is achieved by declaring Protocol Buffer message types in .proto files
For example:

Example 17.5. .library.proto

package book_sample;  
message Book {      
    required string title = 1;
    required string description = 2;
    required int32 publicationYear = 3; // no native Date type available in Protobuf
      
    repeated Author authors = 4;
}
message Author {
    required string name = 1;
    required string surname = 2;
}
The provided example:
  1. An entity named Book is placed in a package named book_sample.
    package book_sample;  
    message Book {
  2. The entity declares several fields of primitive types and a repeatable field named authors.
        required string title = 1;
        required string description = 2;
        required int32 publicationYear = 3; // no native Date type available in Protobuf
          
        repeated Author authors = 4;
    }
  3. The Author message instances are embedded in the Book message instance.
    message Author {
        required string name = 1;
        required string surname = 2;
    }

17.5.2. About Protobuf Messages

There are a few important things to note about Protobuf messages:
  • Nesting of messages is possible, however the resulting structure is strictly a tree, and never a graph.
  • There is no type inheritance.
  • Collections are not supported, however arrays can be easily emulated using repeated fields.

17.5.3. Using Protobuf with Hot Rod

Protobuf can be used with JBoss Data Grid's Hot Rod using the following two steps:
  1. Configure the client to use a dedicated marshaller, in this case, the ProtoStreamMarshaller. This marshaller uses the ProtoStream library to assist in encoding objects.

    Important

    If the infinispan-remote jar is not in use, then the infinispan-remote-query-client Maven dependency must be added to use the ProtoStreamMarshaller.
  2. Instruct ProtoStream library on how to marshall message types by registering per entity marshallers.

Example 17.6. Use the ProtoStreamMarshaller to Encode and Marshall Messages

import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.infinispan.client.hotrod.marshall.ProtoStreamMarshaller;
import org.infinispan.protostream.FileDescriptorSource;
import org.infinispan.protostream.SerializationContext;
...
ConfigurationBuilder clientBuilder = new ConfigurationBuilder();
clientBuilder.addServer()
    .host("127.0.0.1").port(11234)
    .marshaller(new ProtoStreamMarshaller());
    
RemoteCacheManager remoteCacheManager = new RemoteCacheManager(clientBuilder.build());
SerializationContext serCtx = 
    ProtoStreamMarshaller.getSerializationContext(remoteCacheManager);
serCtx.registerProtoFiles(FileDescriptorSource.fromResources("/library.proto"));
serCtx.registerMarshaller(new BookMarshaller());
serCtx.registerMarshaller(new AuthorMarshaller());
// Book and Author classes omitted for brevity
In the provided example,
  • The SerializationContext is provided by the ProtoStream library.
  • The SerializationContext.registerProtofile method receives the name of a .proto classpath resource file that contains the message type definitions.
  • The SerializationContext associated with the RemoteCacheManager is obtained, then ProtoStream is instructed to marshall the protobuf types.

Note

A RemoteCacheManager has no SerializationContext associated with it unless it was configured to use ProtoStreamMarshaller.

17.5.4. Registering Per Entity Marshallers

When using the ProtoStreamMarshaller for remote querying purposes, registration of per entity marshallers for domain model types must be provided by the user for each type or marshalling will fail. When writing marshallers, it is essential that they are stateless and threadsafe, as a single instance of them is being used.
The following example shows how to write a marshaller.

Example 17.7. BookMarshaller.java

import org.infinispan.protostream.MessageMarshaller;
...
public class BookMarshaller implements MessageMarshaller<Book> {
    @Override
    public String getTypeName() {
        return "book_sample.Book";
    }
    @Override
    public Class<? extends Book> getJavaClass() {
        return Book.class;
    }
    @Override
    public void writeTo(ProtoStreamWriter writer, Book book) throws IOException {
        writer.writeString("title", book.getTitle());
        writer.writeString("description", book.getDescription());
        writer.writeCollection("authors", book.getAuthors(), Author.class);
    }
    @Override
    public Book readFrom(ProtoStreamReader reader) throws IOException {
        String title = reader.readString("title");
        String description = reader.readString("description");
        int publicationYear = reader.readInt("publicationYear");
        Set<Author> authors = reader.readCollection("authors", 
            new HashSet<Author>(), Author.class);
        return new Book(title, description, publicationYear, authors);
    }
}
Once the client has been set up, reading and writing Java objects to the remote cache will use the entity marshallers. The actual data stored in the cache will be protobuf encoded, provided that marshallers were registered with the remote client for all involved types. In the provided example, this would be Book and Author.
Objects stored in protobuf format are able to be utilized with compatible clients written in different languages.

17.5.5. Indexing Protobuf Encoded Entities

Once the client is configured to use Protobuf, indexing can be configured for caches on the server side.
To index the entries, the server must have the knowledge of the message types defined by the Protobuf schema. A Protobuf schema file is defined in a file with a .proto extension. The schema is supplied to the server either by placing it in the ___protobuf_metadata cache by a put, putAll, putIfAbsent, or replace operation, or alternatively by invoking ProtobufMetadataManager MBean via JMX. Both keys and values of ___protobuf_metadata cache are Strings, the key being the file name, while the value is the schema file contents.

Example 17.8. Registering a Protocol Buffers schema file

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.query.remote.client.ProtobufMetadataManagerConstants;

RemoteCacheManager remoteCacheManager = ... // obtain a RemoteCacheManager 

// obtain the '__protobuf_metadata' cache
RemoteCache<String, String> metadataCache = 
    remoteCacheManager.getCache(
        ProtobufMetadataManagerConstants.PROTOBUF_METADATA_CACHE_NAME);

String schemaFileContents = ... // this is the contents of the schema file
metadataCache.put("my_protobuf_schema.proto", schemaFileContents);
The ProtobufMetadataManager is a cluster-wide replicated repository of Protobuf schema definitions or.proto files. For each running cache manager, a separate ProtobufMetadataManager MBean instance exists, and is backed by the ___protobuf_metadata cache. The ProtobufMetadataManager ObjectName uses the following pattern:
<jmx domain>:type=RemoteQuery,
    name=<cache manager<methodname>putAllname>,
    component=ProtobufMetadataManager
The following signature is used by the method that registers the Protobuf schema file:
void registerProtofile(String name, String contents)
If indexing is enabled for a cache, all fields of Protobuf-encoded entries are indexed. All Protobuf-encoded entries are searchable, regardless of whether indexing is enabled.

Note

Indexing is recommended for improved performance but is not mandatory when using remote queries. Using indexing improves the searching speed but can also reduce the insert/update speeds due to the overhead required to maintain the index.

17.5.6. Custom Fields Indexing with Protobuf

All Protobuf type fields are indexed and stored by default. This behavior is acceptable in most cases but it can result in performance issues if used with too many or very large fields. To specify the fields to index and store, use the @Indexed and @IndexedField annotations directly to the Protobuf schema in the documentation comments of message type definitions and field definitions.

Example 17.9. Specifying Which Fields are Indexed

/* 
  This type is indexed, but not all its fields are.
  @Indexed 
*/
message Note {

  /* 
     This field is indexed but not stored. 
     It can be used for querying but not for projections.
     @IndexedField(index=true, store=false) 
   */
  optional string text = 1;

  /* 
    A field that is both indexed and stored.
    @IndexedField 
   */
  optional string author = 2;

  /* @IndexedField(index=false, store=true) */
  optional bool isRead = 3;

  /* This field is not annotated, so it is neither indexed nor stored. */
  optional int32 priority;  
}
Add documentation annotations to the last line of the documentation comment that precedes the element to be annotated (message type or field definition).
The @Indexed annotation only applies to message types, has a boolean value (the default is true). As a result, using @Indexed is equivalent to @Indexed(true). This annotation is used to selectively specify the fields of the message type which must be indexed. Using @Indexed(false), however, indicates that no fields are to be indexed and so the eventual @IndexedField annotation at the field level is ignored.
The @IndexedField annotation only applies to fields, has two attributes (index and store), both of which default to true (using @IndexedField is equivalent to @IndexedField(index=true, store=true)). The index attribute indicates whether the field is indexed, and is therefore used for indexed queries. The store attributes indicates whether the field value must be stored in the index, so that the value is available for projections.

Note

The @IndexedField annotation is only effective if the message type that contains it is annotated with @Indexed.

17.5.7. Defining Protocol Buffers Schemas With Java Annotations

You can declare Protobuf metadata using Java annotations. Instead of providing a MessageMarshaller implementation and a .proto schema file, you can add minimal annotations to a Java class and its fields.
The objective of this method is to marshal Java objects to protobuf using the ProtoStream library. The ProtoStream library internally generates the marshallar and does not require a manually implemented one. The Java annotations require minimal information such as the Protobuf tag number. The rest is inferred based on common sense defaults ( Protobuf type, Java collection type, and collection element type) and is possible to override.
The auto-generated schema is registered with the SerializationContext and is also available to the users to be used as a reference to implement domain model classes and marshallers for other languages.
The following are examples of Java annotations

Example 17.10. User.Java

package sample;

import org.infinispan.protostream.annotations.ProtoEnum;
import org.infinispan.protostream.annotations.ProtoEnumValue;
import org.infinispan.protostream.annotations.ProtoField;
import org.infinispan.protostream.annotations.ProtoMessage;

@ProtoMessage(name = "ApplicationUser")
public class User {

    @ProtoEnum(name = "Gender")
    public enum Gender {
        @ProtoEnumValue(number = 1, name = "M")
        MALE,

        @ProtoEnumValue(number = 2, name = "F")
        FEMALE
    }

    @ProtoField(number = 1, required = true)
    public String name;

    @ProtoField(number = 2)
    public Gender gender;
}

Example 17.11. Note.Java

package sample;
 
import org.infinispan.protostream.annotations.ProtoDoc;
import org.infinispan.protostream.annotations.ProtoField;
 
@ProtoDoc("@Indexed")
public class Note {
 
    private String text;
 
    private User author;
 
    @ProtoDoc("@IndexedField(index = true, store = false)")
    @ProtoField(number = 1)
    public String getText() {
        return text;
    }
 
    public void setText(String text) {
        this.text = text;
    }
 
    @ProtoDoc("@IndexedField")
    @ProtoField(number = 2)
    public User getAuthor() {
        return author;
    }
 
    public void setAuthor(User author) {
        this.author = author;
    }
}

Example 17.12. ProtoSchemaBuilderDemo.Java

import org.infinispan.protostream.SerializationContext;
import org.infinispan.protostream.annotations.ProtoSchemaBuilder;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.marshall.ProtoStreamMarshaller;

... 

RemoteCacheManager remoteCacheManager = ... // we have a RemoteCacheManager
SerializationContext serCtx = 
    ProtoStreamMarshaller.getSerializationContext(remoteCacheManager);

// generate and register a Protobuf schema and marshallers based 
// on Note class and the referenced classes (User class)
ProtoSchemaBuilder protoSchemaBuilder = new ProtoSchemaBuilder();
String generatedSchema = protoSchemaBuilder
    .fileName("sample_schema.proto")
    .packageName("sample_package")
    .addClass(Note.class)
    .build(serCtx);

// the types can be marshalled now
assertTrue(serCtx.canMarshall(User.class));
assertTrue(serCtx.canMarshall(Note.class));
assertTrue(serCtx.canMarshall(User.Gender.class));

// display the schema file
System.out.println(generatedSchema);
The following is the .proto file that is generated by the ProtoSchemaBuilderDemo.java example.

Example 17.13. Sample_Schema.Proto

package sample_package;
 
 /* @Indexed */
message Note {
 
   /* @IndexedField(index = true, store = false) */
   optional string text = 1;
 
   /* @IndexedField */
   optional ApplicationUser author = 2;
}
 
message ApplicationUser {
   
   enum Gender {   
      M = 1;   
      F = 2;
   }
 
   required string name = 1;
   optional Gender gender = 2;
}
The following table lists the supported Java annotations with its application and parameters.

Table 17.2. Java Annotations

Annotation Applies To Purpose Requirement Parameters
@ProtoDoc Class/Field/Enum/Enum member Specifies the documentation comment that will be attached to the generated Protobuf schema element (message type, field definition, enum type, enum value definition) Optional A single String parameter, the documentation text
@ProtoMessage Class Specifies the name of the generated message type. If missing, the class name if used instead Optional name (String), the name of the generated message type; if missing the Java class name is used by default
@ProtoField Field, Getter or Setter
Specifies the Protobuf field number and its Protobuf type. Also indicates if the field is repeated, optional or required and its (optional) default value. If the Java field type is an interface or an abstract class, its actual type must be indicated. If the field is repeatable and the declared collection type is abstract then the actual collection implementation type must be specified. If this annotation is missing, the field is ignored for marshalling (it is transient). A class must have at least one @ProtoField annotated field to be considered Protobuf marshallable.
Required number (int, mandatory), the Protobuf number type (org.infinispan.protostream.descriptors.Type, optional), the Protobuf type, it can usually be inferred required (boolean, optional)name (String, optional), the Protobuf namejavaType (Class, optional), the actual type, only needed if declared type is abstract collectionImplementation (Class, optional), the actual collection type, only needed if declared type is abstract defaultValue (String, optional), the string must have the proper format according to the Java field type
@ProtoEnum Enum Specifies the name of the generated enum type. If missing, the Java enum name if used instead Optional name (String), the name of the generated enum type; if missing the Java enum name is used by default
@ProtoEnumValue Enum member Specifies the numeric value of the corresponding Protobuf enum value Required number (int, mandatory), the Protobuf number name (String), the Protobuf name; if missing the name of the Java member is used

Note

The @ProtoDoc annotation can be used to provide documentation comments in the generated schema and also allows to inject the @Indexed and @IndexedField annotations where needed (see Section 17.5.6, “Custom Fields Indexing with Protobuf”).