Chapter 7. Setting up Data Grid services

Use Data Grid Operator to create clusters of either Cache service or Data Grid service pods.

7.1. Service types

Services are stateful applications, based on the Data Grid Server image, that provide flexible and robust in-memory data storage. When you create Data Grid clusters, you specify either DataGrid or Cache as the service type with the spec.service.type field.

DataGrid service type
Deploy Data Grid clusters with full configuration and capabilities.
Cache service type
Deploy Data Grid clusters with minimal configuration.

Red Hat recommends the DataGrid service type for clusters because it lets you:

  • Back up data across global clusters with cross-site replication.
  • Create caches with any valid configuration.
  • Add file-based cache stores to save data in a persistent volume.
  • Query values across caches using the Data Grid Query API.
  • Use advanced Data Grid features and capabilities.
Important

The Cache service type was designed to provide a convenient way to create a low-latency data store with minimal configuration. Additional development on the Infinispan CRD has shown that the DataGrid CR offers a better approach to achieving this goal, ultimately giving users more choice and less deployment overhead. For this reason, the Cache service type is planned for removal in the next version of the Infinispan CRD and is no longer under active development.

The DataGrid service type continues to benefit from new features and improved tooling to automate complex operations such as cluster upgrades and data migration.

7.2. Creating Data Grid service pods

To use custom cache definitions along with Data Grid capabilities such as cross-site replication, create clusters of Data Grid service pods.

Procedure

  1. Create an Infinispan CR that sets spec.service.type: DataGrid and configures any other Data Grid service resources.

    apiVersion: infinispan.org/v1
    kind: Infinispan
    metadata:
      name: infinispan
    spec:
      replicas: 2
      version: <Data Grid_version>
      service:
        type: DataGrid
    Important

    You cannot change the spec.service.type field after you create pods. To change the service type, you must delete the existing pods and create new ones.

  2. Apply your Infinispan CR to create the cluster.

7.2.1. Data Grid service CR

This topic describes the Infinispan CR for Data Grid service pods.

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
  annotations:
    infinispan.org/monitoring: 'true'
spec:
  replicas: 6
  version: 8.4.6-1
  upgrades:
    type: Shutdown
  service:
    type: DataGrid
    container:
      storage: 2Gi
      # The ephemeralStorage and storageClassName fields are mutually exclusive.
      ephemeralStorage: false
      storageClassName: my-storage-class
    sites:
      local:
        name: azure
        expose:
          type: LoadBalancer
      locations:
      - name: azure
        url: openshift://api.azure.host:6443
        secretName: azure-token
      - name: aws
        clusterName: infinispan
        namespace: rhdg-namespace
        url: openshift://api.aws.host:6443
        secretName: aws-token
  security:
    endpointSecretName: endpoint-identities
    endpointEncryption:
      type: Secret
      certSecretName: tls-secret
  container:
    extraJvmOpts: "-XX:NativeMemoryTracking=summary"
    cpu: "2000m:1000m"
    memory: "2Gi:1Gi"
  logging:
    categories:
      org.infinispan: debug
      org.jgroups: debug
      org.jgroups.protocols.TCP: error
      org.jgroups.protocols.relay.RELAY2: error
  expose:
    type: LoadBalancer
  configMapName: "my-cluster-config"
  configListener:
    enabled: true
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: infinispan-pod
              clusterName: infinispan
              infinispan_cr: infinispan
          topologyKey: "kubernetes.io/hostname"
Field / Description

metadata.name

Names your Data Grid cluster.

metadata.annotations.infinispan.org/monitoring

Automatically creates a ServiceMonitor for your cluster.

spec.replicas

Specifies the number of pods in your cluster.

spec.version

Specifies the Data Grid Server version of your cluster.

spec.upgrades.type

Controls how Data Grid Operator upgrades your Data Grid cluster when new versions become available.

spec.service.type

Configures the type of Data Grid service. A value of DataGrid creates a cluster with Data Grid service pods.

spec.service.container

Configures the storage resources for Data Grid service pods.

spec.service.sites

Configures cross-site replication.

spec.security.endpointSecretName

Specifies an authentication secret that contains Data Grid user credentials.

spec.security.endpointEncryption

Specifies TLS certificates and keystores to encrypt client connections.

spec.container

Specifies JVM, CPU, and memory resources for Data Grid pods.

spec.logging

Configures Data Grid logging categories.

spec.expose

Controls how Data Grid endpoints are exposed on the network.

spec.configMapName

Specifies a ConfigMap that contains Data Grid configuration.

spec.configListener.enabled

Creates a listener pod in each Data Grid cluster that allows Data Grid Operator to reconcile server-side modifications with Data Grid resources such as the Cache CR.

The listener pod consumes minimal resources and is enabled by default. Setting a value of false removes the listener pod and disables bi-directional reconciliation. You should do this only if you do not need declarative Kubernetes representations of Data Grid resources created through the Data Grid Console, CLI, or client applications.

spec.configListener.logging.level

Configures the logging level for the ConfigListener deployments. The default level is info. You can change it to debug or error.

spec.affinity

Configures anti-affinity strategies that guarantee Data Grid availability.
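
As a minimal sketch of the two configListener fields described above, the following fragment keeps the listener pod enabled and raises its log level to debug. The field names and values are taken from the table; merge the fragment into your own Infinispan CR rather than applying it on its own.

```yaml
spec:
  configListener:
    # Enabled by default; shown explicitly for clarity.
    enabled: true
    logging:
      # The default level is info; debug and error are the other documented values.
      level: debug
```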

7.3. Allocating storage resources

You can allocate storage for Data Grid service pods but not Cache service pods.

By default, Data Grid Operator allocates 1Gi for the persistent volume claim. However, you should adjust the amount of storage available to Data Grid service pods so that Data Grid can preserve cluster state during shutdown.

Important

If available container storage is less than the amount of available memory, data loss can occur.
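
For illustration, the following fragment sizes container storage to match the memory limit so that storage is not smaller than memory. It assumes that "available memory" corresponds to the limit set in spec.container.memory; the 2Gi values are placeholders, not sizing recommendations.

```yaml
spec:
  service:
    type: DataGrid
    container:
      # Storage for the persistent volume claim; kept at least as large as the memory limit below.
      storage: 2Gi
  container:
    # Format is <limit>:<requests>.
    memory: "2Gi:1Gi"
```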

Procedure

  1. Allocate storage resources with the spec.service.container.storage field.
  2. Configure either the ephemeralStorage field or the storageClassName field as required.

    Note

    These fields are mutually exclusive. Add only one of them to your Infinispan CR.

  3. Apply the changes.

Ephemeral storage

spec:
  service:
    type: DataGrid
    container:
      storage: 2Gi
      ephemeralStorage: true

Name of a StorageClass object

spec:
  service:
    type: DataGrid
    container:
      storage: 2Gi
      storageClassName: my-storage-class

Field / Description

spec.service.container.storage

Specifies the amount of storage for Data Grid service pods.

spec.service.container.ephemeralStorage

Defines whether storage is ephemeral or permanent. Set the value to true to use ephemeral storage, which means all data in storage is deleted when clusters shut down or restart. The default value is false, which means storage is permanent.

spec.service.container.storageClassName

Specifies the name of a StorageClass object to use for the persistent volume claim (PVC). If you include this field, you must specify an existing storage class as the value. If you do not include this field, the persistent volume claim uses the storage class that has the storageclass.kubernetes.io/is-default-class annotation set to true.

7.3.1. Persistent volume claims

Data Grid Operator creates a persistent volume claim (PVC) and mounts container storage at:
/opt/infinispan/server/data

Caches

When you create caches, Data Grid permanently stores their configuration so your caches are available after cluster restarts. This applies to both Cache service and Data Grid service pods.

Data

Data is always volatile in clusters of Cache service pods. When you shut down the cluster, you permanently lose the data.

Use a file-based cache store, by adding the <file-store/> element to your Data Grid cache configuration, if you want Data Grid service pods to persist data during cluster shutdown.
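
The following is a rough sketch of a file-based cache store in YAML form, assuming you define the cache with a Cache CR. The resource name mycachedefinition, the cache name mycache, and the cluster name infinispan are hypothetical, and the fileStore entry is intended as the YAML counterpart of the <file-store/> element. Check the Cache CR schema for your Data Grid Operator version before relying on it.

```yaml
apiVersion: infinispan.org/v2alpha1
kind: Cache
metadata:
  # Hypothetical name for the Cache CR itself.
  name: mycachedefinition
spec:
  # Must match metadata.name of your Infinispan CR.
  clusterName: infinispan
  # Hypothetical cache name.
  name: mycache
  template: |-
    distributedCache:
      mode: "SYNC"
      persistence:
        # Empty file store with default settings; data is written under the server data directory.
        fileStore: ~
```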

7.4. Allocating CPU and memory

Allocate CPU and memory resources to Data Grid pods with the Infinispan CR.

Note

Data Grid Operator requests 1Gi of memory from the OpenShift scheduler when creating Data Grid pods. CPU requests are unbounded by default.

Procedure

  1. Allocate the number of CPU units with the spec.container.cpu field.
  2. Allocate the amount of memory, in bytes, with the spec.container.memory field.

    The cpu and memory fields have values in the format of <limit>:<requests>. For example, cpu: "2000m:1000m" limits pods to a maximum of 2000m of CPU and requests 1000m of CPU for each pod at startup. Specifying a single value sets both the limit and request.

  3. Apply your Infinispan CR.

    If your cluster is running, Data Grid Operator restarts the Data Grid pods so changes take effect.

spec:
  container:
    cpu: "2000m:1000m"
    memory: "2Gi:1Gi"
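
For comparison, a minimal sketch of the single-value form mentioned in the procedure, where one value sets both the limit and the request. The amounts are placeholders.

```yaml
spec:
  container:
    # A single value sets both the limit and the request.
    cpu: "1000m"
    memory: "1Gi"
```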

7.5. Setting JVM options

Pass additional JVM options to Data Grid pods at startup.

Procedure

  1. Configure JVM options with the spec.container field in your Infinispan CR.
  2. Apply your Infinispan CR.

    If your cluster is running, Data Grid Operator restarts the Data Grid pods so changes take effect.

JVM options

spec:
  container:
    extraJvmOpts: "-<option>=<value>"
    routerExtraJvmOpts: "-<option>=<value>"
    cliExtraJvmOpts: "-<option>=<value>"

Field / Description

spec.container.extraJvmOpts

Specifies additional JVM options for the Data Grid Server.

spec.container.routerExtraJvmOpts

Specifies additional JVM options for the Gossip router.

spec.container.cliExtraJvmOpts

Specifies additional JVM options for the Data Grid CLI.

7.6. Configuring pod priority

Create one or more priority classes to indicate the importance of a pod relative to other pods. Pods with higher priority are scheduled ahead of pods with lower priority, ensuring prioritization of pods running critical workloads, especially when resources become constrained.

Prerequisites

  • Have cluster-admin access to OpenShift.

Procedure

  1. Define a PriorityClass object by specifying its name and value.

    high-priority.yaml

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000
    globalDefault: false
    description: "Use this priority class for high priority service pods only."

  2. Create the priority class.

    oc create -f high-priority.yaml
  3. Reference the priority class name in the pod configuration.

    Infinispan CR

    kind: Infinispan
    ...
    spec:
      scheduling:
        affinity:
          ...
        priorityClassName: "high-priority"
        ...

    You must reference an existing priority class name, otherwise the pod is rejected.

  4. Apply the changes.

7.7. FIPS mode for your Infinispan CR

The Red Hat OpenShift Container Platform can use certain Federal Information Processing Standards (FIPS) components that ensure OpenShift clusters meet the requirements of a FIPS compliance audit.

If you enabled FIPS mode on your OpenShift cluster then the Data Grid Operator automatically enables FIPS mode for your Infinispan custom resource (CR).

Important

Client certificate authentication is not currently supported with FIPS mode. Attempts to create an Infinispan CR with spec.security.endpointEncryption.clientCert set to a value other than None will fail.
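
As a sketch of an encryption configuration that stays within this restriction, the following fragment sets clientCert to None explicitly. The secret name tls-secret is reused from the CR examples in this chapter and is a placeholder for your own secret.

```yaml
spec:
  security:
    endpointEncryption:
      type: Secret
      certSecretName: tls-secret
      # Any value other than None is rejected when FIPS mode is enabled.
      clientCert: None
```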

7.8. Adjusting log levels

Change levels for different Data Grid logging categories when you need to debug issues. You can also adjust log levels to reduce the number of messages for certain categories to minimize the use of container resources.

Procedure

  1. Configure Data Grid logging with the spec.logging.categories field in your Infinispan CR.

    spec:
      logging:
        categories:
          org.infinispan: debug
          org.jgroups: debug
  2. Apply the changes.
  3. Retrieve logs from Data Grid pods as required.

    oc logs -f $POD_NAME
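
The same field can also lower verbosity. As a minimal sketch, the following fragment raises the thresholds for both root categories so that fewer messages are written; the chosen levels are only an example.

```yaml
spec:
  logging:
    categories:
      # Log warnings and errors only for Data Grid messages.
      org.infinispan: warn
      # Log errors only for cluster transport messages.
      org.jgroups: error
```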

7.8.1. Logging reference

Find information about log categories and levels.

Table 7.1. Log categories

Root category     Description                   Default level
org.infinispan    Data Grid messages            info
org.jgroups       Cluster transport messages    info

Table 7.2. Log levels

Log level / Description

trace

Provides detailed information about running state of applications. This is the most verbose log level.

debug

Indicates the progress of individual requests or activities.

info

Indicates overall progress of applications, including lifecycle events.

warn

Indicates circumstances that can lead to error or degrade performance.

error

Indicates error conditions that might prevent operations or activities from being successful but do not prevent applications from running.

Garbage collection (GC) messages

Data Grid Operator does not log GC messages by default. You can direct GC messages to stdout with the following JVM options:

extraJvmOpts: "-Xlog:gc*:stdout:time,level,tags"
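
For context, a short sketch of where that option sits in an Infinispan CR, using the spec.container.extraJvmOpts field described in the JVM options section:

```yaml
spec:
  container:
    # JVM unified logging: write all gc* tags to stdout, decorated with time, level, and tags.
    extraJvmOpts: "-Xlog:gc*:stdout:time,level,tags"
```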

7.9. Creating Cache service pods

Create Data Grid clusters with Cache service pods for a volatile, low-latency data store with minimal configuration.

Important

Cache service pods provide volatile storage only, which means you lose all data when you modify your Infinispan CR or update the version of your Data Grid cluster.

Procedure

  1. Create an Infinispan CR that sets spec.service.type: Cache and configures any other Cache service resources.

    apiVersion: infinispan.org/v1
    kind: Infinispan
    metadata:
      name: infinispan
    spec:
      replicas: 2
      version: <Data Grid_version>
      service:
        type: Cache
  2. Apply your Infinispan CR to create the cluster.

7.9.1. Cache service CR

This topic describes the Infinispan CR for Cache service pods.

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan
  annotations:
    infinispan.org/monitoring: 'true'
spec:
  replicas: 2
  version: 8.4.6-1
  upgrades:
    type: Shutdown
  service:
    type: Cache
    replicationFactor: 2
  autoscale:
    maxMemUsagePercent: 70
    maxReplicas: 5
    minMemUsagePercent: 30
    minReplicas: 2
  security:
    endpointSecretName: endpoint-identities
    endpointEncryption:
      type: Secret
      certSecretName: tls-secret
  container:
    extraJvmOpts: "-XX:NativeMemoryTracking=summary"
    cpu: "2000m:1000m"
    memory: "2Gi:1Gi"
  logging:
    categories:
      org.infinispan: trace
      org.jgroups: trace
  expose:
    type: LoadBalancer
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: infinispan-pod
              clusterName: infinispan
              infinispan_cr: infinispan
          topologyKey: "kubernetes.io/hostname"
Field / Description

metadata.name

Names your Data Grid cluster.

metadata.annotations.infinispan.org/monitoring

Automatically creates a ServiceMonitor for your cluster.

spec.replicas

Specifies the number of pods in your cluster. If you enable autoscaling capabilities, this field specifies the initial number of pods.

spec.version

Specifies the Data Grid Server version of your cluster.

spec.upgrades.type

Controls how Data Grid Operator upgrades your Data Grid cluster when new versions become available.

spec.service.type

Configures the type of Data Grid service. A value of Cache creates a cluster with Cache service pods.

spec.service.replicationFactor

Sets the number of copies for each entry across the cluster. The default for Cache service pods is two, which replicates each cache entry to avoid data loss.

spec.autoscale

Enables and configures automatic scaling.

spec.security.endpointSecretName

Specifies an authentication secret that contains Data Grid user credentials.

spec.security.endpointEncryption

Specifies TLS certificates and keystores to encrypt client connections.

spec.container

Specifies JVM, CPU, and memory resources for Data Grid pods.

spec.logging

Configures Data Grid logging categories.

spec.expose

Controls how Data Grid endpoints are exposed on the network.

spec.affinity

Configures anti-affinity strategies that guarantee Data Grid availability.

7.10. Automatic scaling

Data Grid Operator can monitor the default cache on Cache service pods to automatically scale clusters up or down, by creating or deleting pods based on memory usage.

Important

Automatic scaling is available for clusters of Cache service pods only. Data Grid Operator does not perform automatic scaling for clusters of Data Grid service pods.

When you enable automatic scaling, you define memory usage thresholds that let Data Grid Operator determine when it needs to create or delete pods. Data Grid Operator monitors statistics for the default cache and, when memory usage reaches the configured thresholds, scales your clusters up or down.

Maximum threshold

This threshold sets an upper boundary for the amount of memory that pods in your cluster can use before scaling up or performing eviction. When Data Grid Operator detects that any node reaches the maximum amount of memory that you configure, it creates a new node if possible. If Data Grid Operator cannot create a new node then it performs eviction when memory usage reaches 100 percent.

Minimum threshold

This threshold sets a lower boundary for memory usage across your Data Grid cluster. When Data Grid Operator detects that memory usage falls below the minimum, it shuts down pods.

Default cache only

Autoscaling capabilities work with the default cache only. If you plan to add other caches to your cluster, you should not include the autoscale field in your Infinispan CR. In this case you should use eviction to control the size of the data container on each node.
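
As a rough sketch of the eviction alternative, the following cache configuration bounds the data container by size and removes entries when the bound is reached. The 200MB limit is an arbitrary placeholder, and the fragment assumes the YAML form of Data Grid cache configuration; adapt it to the caches you create.

```yaml
distributedCache:
  mode: "SYNC"
  memory:
    # Upper bound for the data container on each node (placeholder value).
    maxSize: "200MB"
    # Evict entries to make room instead of rejecting writes.
    whenFull: "REMOVE"
```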

7.10.1. Configuring automatic scaling

If you create clusters with Cache service pods, you can configure Data Grid Operator to automatically scale clusters.

Procedure

  1. Add the spec.autoscale resource to your Infinispan CR to enable automatic scaling.

    Note

    Set a value of true for the autoscale.disabled field to disable automatic scaling.

  2. Configure thresholds for automatic scaling with the following fields:

    Field / Description

    spec.autoscale.maxMemUsagePercent

    Specifies a maximum threshold, as a percentage, for memory usage on each node.

    spec.autoscale.maxReplicas

    Specifies the maximum number of Cache service pods for the cluster.

    spec.autoscale.minMemUsagePercent

    Specifies a minimum threshold, as a percentage, for cluster memory usage.

    spec.autoscale.minReplicas

    Specifies the minimum number of Cache service pods for the cluster.

    For example, add the following to your Infinispan CR:

    spec:
      service:
        type: Cache
      autoscale:
        disabled: false
        maxMemUsagePercent: 70
        maxReplicas: 5
        minMemUsagePercent: 30
        minReplicas: 2
  3. Apply the changes.
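
To switch automatic scaling off without removing the thresholds, the note in step 1 suggests setting the disabled field to true. A minimal sketch, reusing the threshold values from the example above:

```yaml
spec:
  service:
    type: Cache
  autoscale:
    # Thresholds stay in the CR but the Operator does not act on them.
    disabled: true
    maxMemUsagePercent: 70
    maxReplicas: 5
    minMemUsagePercent: 30
    minReplicas: 2
```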

7.11. Adding labels and annotations to Data Grid resources

Attach key/value labels and annotations to pods and services that Data Grid Operator creates and manages. Labels help you identify relationships between objects to better organize and monitor Data Grid resources. Annotations are arbitrary non-identifying metadata for client applications or deployment and management tooling.

Note

Red Hat subscription labels are automatically applied to Data Grid resources.

Procedure

  1. Open your Infinispan CR for editing.
  2. Attach labels and annotations to Data Grid resources in the metadata.annotations section.

    • Define values for annotations directly in the metadata.annotations section.
    • Define values for labels with the metadata.labels field.
  3. Apply your Infinispan CR.

Custom annotations

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  annotations:
    infinispan.org/targetAnnotations: service-annotation1, service-annotation2
    infinispan.org/podTargetAnnotations: pod-annotation1, pod-annotation2
    infinispan.org/routerAnnotations: router-annotation1, router-annotation2

    service-annotation1: value
    service-annotation2: value
    pod-annotation1: value
    pod-annotation2: value
    router-annotation1: value
    router-annotation2: value

Custom labels

apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  annotations:
    infinispan.org/targetLabels: service-label1, service-label2
    infinispan.org/podTargetLabels: pod-label1, pod-label2
  labels:
    service-label1: value
    service-label2: value
    pod-label1: value
    pod-label2: value
    # The operator does not attach these labels to resources.
    my-label: my-value
    environment: development

7.12. Adding labels and annotations with environment variables

Set environment variables for Data Grid Operator to add labels and annotations that automatically propagate to all Data Grid pods and services.

Procedure

Add labels and annotations to your Data Grid Operator subscription with the spec.config.env field in one of the following ways:

  • Use the oc edit subscription command.

    oc edit subscription datagrid -n openshift-operators
  • Use the Red Hat OpenShift Console.

    1. Navigate to Operators > Installed Operators > Data Grid Operator.
    2. From the Actions menu, select Edit Subscription.

Labels and annotations with environment variables

spec:
  config:
    env:
      - name: INFINISPAN_OPERATOR_TARGET_LABELS
        value: |
         {"service-label1":"value",
         "service-label2":"value"}
      - name: INFINISPAN_OPERATOR_POD_TARGET_LABELS
        value: |
         {"pod-label1":"value",
         "pod-label2":"value"}
      - name: INFINISPAN_OPERATOR_TARGET_ANNOTATIONS
        value: |
         {"service-annotation1":"value",
         "service-annotation2":"value"}
      - name: INFINISPAN_OPERATOR_POD_TARGET_ANNOTATIONS
        value: |
         {"pod-annotation1":"value",
         "pod-annotation2":"value"}

7.13. Defining environment variables in the Data Grid Operator subscription

You can define environment variables in your Data Grid Operator subscription either when you create or edit the subscription.

Note

If you are using the Red Hat OpenShift Console, you must first install the Data Grid Operator and then edit the existing subscription.

spec.config.env field
Includes the name and value fields to define environment variables.
ADDITIONAL_VARS variable
Includes the names of environment variables as a JSON array. Environment variables within the value of the ADDITIONAL_VARS variable automatically propagate to each Data Grid Server pod managed by the associated Operator.

Prerequisites

  • Ensure the Operator Lifecycle Manager (OLM) is installed.
  • Have an oc client.

Procedure

  1. Create a subscription definition YAML for your Data Grid Operator:

    1. Use the spec.config.env field to define environment variables.
    2. Within the ADDITIONAL_VARS variable, include environment variable names in a JSON array.

      subscription-datagrid.yaml

      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: datagrid
        namespace: openshift-operators
      spec:
        channel: 8.4.x
        installPlanApproval: Automatic
        name: datagrid
        source: redhat-operators
        sourceNamespace: openshift-marketplace
        config:
          env:
          - name: ADDITIONAL_VARS
            value: "[\"VAR_NAME\", \"ANOTHER_VAR\"]"
          - name: VAR_NAME
            value: $(VAR_NAME_VALUE)
          - name: ANOTHER_VAR
            value: $(ANOTHER_VAR_VALUE)

      For example, use the environment variables to set the local time zone:

      subscription-datagrid.yaml

      kind: Subscription
      spec:
        ...
        config:
          env:
          - name: ADDITIONAL_VARS
            value: "[\"TZ\"]"
          - name: TZ
            value: "JST-9"

  2. Create a subscription for Data Grid Operator:

    oc apply -f subscription-datagrid.yaml

Verification

  • Retrieve the names of the environment variables defined in the subscription:

    oc get subscription datagrid -n openshift-operators -o jsonpath='{.spec.config.env[*].name}'

Next steps

  1. Use the oc edit subscription command to modify the environment variable:

    oc edit subscription datagrid -n openshift-operators
  2. To ensure the changes take effect on your Data Grid clusters, you must recreate the existing clusters. Terminate the pods by deleting the StatefulSet associated with the existing Infinispan CRs.
  • Alternatively, edit the subscription in the Red Hat OpenShift Console: navigate to Operators > Installed Operators > Data Grid Operator and, from the Actions menu, select Edit Subscription.