Chapter 13. Setting up cross-site replication

Ensure availability with Data Grid Operator by configuring geographically distributed clusters as a unified service.

You can configure clusters to perform cross-site replication with:

  • Connections that Data Grid Operator manages.
  • Connections that you configure and manage.
Note

You can use both managed and manual connections for Data Grid clusters in the same Infinispan CR. You must ensure that Data Grid clusters establish connections in the same way at each site.

13.1. Cross-site replication expose types

You can use a NodePort service, a LoadBalancer service, or an OpenShift Route to handle network traffic for backup operations between Data Grid clusters. Before you start setting up cross-site replication you should determine what expose type is available for your Red Hat OpenShift cluster. In some cases you may require an administrator to provision services before you can configure an expose type.

NodePort

A NodePort is a service that accepts network traffic at a static port, in the 30000 to 32767 range, on an IP address that is available externally to the OpenShift cluster.

To use a NodePort as the expose type for cross-site replication, an administrator must provision external IP addresses for each OpenShift node. In most cases, an administrator must also configure DNS routing for those external IP addresses.

LoadBalancer

A LoadBalancer is a service that directs network traffic to the correct node in the OpenShift cluster.

Whether you can use a LoadBalancer as the expose type for cross-site replication depends on the host platform. AWS supports network load balancers (NLB) while some other cloud platforms do not. To use a LoadBalancer service, an administrator must first create an ingress controller backed by an NLB.

Route

An OpenShift Route allows Data Grid clusters to connect with each other through a public secure URL.

Data Grid uses TLS with the SNI header to send backup requests between clusters through an OpenShift Route. To do this you must add a keystore with TLS certificates so that Data Grid can encrypt network traffic for cross-site replication.

When you specify Route as the expose type for cross-site replication, Data Grid Operator creates a route with TLS passthrough encryption for each Data Grid cluster that it manages. You can specify a hostname for the Route but you cannot specify a Route that you have already created.

13.2. Managed cross-site replication

Data Grid Operator can discover Data Grid clusters running in different data centers to form global clusters.

When you configure managed cross-site connections, Data Grid Operator creates router pods in each Data Grid cluster. Data Grid pods use the <cluster_name>-site service to connect to these router pods and send backup requests.

Router pods maintain a record of all pod IP addresses and parse RELAY message headers to forward backup requests to the correct Data Grid cluster. If a router pod crashes then all Data Grid pods start using any other available router pod until OpenShift restores it.

Important

To manage cross-site connections, Data Grid Operator uses the Kubernetes API. Each OpenShift cluster must have network access to the remote Kubernetes API and a service account token for each backup cluster.

Note

Data Grid clusters do not start running until Data Grid Operator discovers all backup locations that you configure.

13.2.1. Creating service account tokens for managed cross-site connections

Generate service account tokens on OpenShift clusters that allow Data Grid Operator to automatically discover Data Grid clusters and manage cross-site connections.

Prerequisites

  • Ensure all OpenShift clusters have access to the Kubernetes API.
    Data Grid Operator uses this API to manage cross-site connections.

    Note

    Data Grid Operator does not modify remote Data Grid clusters. The service account tokens provide read only access through the Kubernetes API.

Procedure

  1. Log in to an OpenShift cluster.
  2. Create a service account.

    For example, create a service account at LON:

    oc create sa lon
    serviceaccount/lon created
  3. Add the view role to the service account with the following command:

    oc policy add-role-to-user view system:serviceaccount:<namespace>:lon
  4. If you use a NodePort service to expose Data Grid clusters on the network, you must also add the cluster-reader role to the service account:

    oc adm policy add-cluster-role-to-user cluster-reader -z <service-account-name> -n <namespace>
  5. Repeat the preceding steps on your other OpenShift clusters.
  6. Exchange service account tokens on each OpenShift cluster.

13.2.2. Exchanging service account tokens

After you create service account tokens on your OpenShift clusters, you add them to secrets on each backup location. For example, at LON you add the service account token for NYC. At NYC you add the service account token for LON.

Prerequisites

  • Get tokens from each service account.

    Use the following command or get the token from the OpenShift Web Console:

    oc sa get-token lon
    
    eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9...

Procedure

  1. Log in to an OpenShift cluster.
  2. Add the service account token for a backup location with the following command:

    oc create secret generic <token-name> --from-literal=token=<token>

    For example, log in to the OpenShift cluster at NYC and create a lon-token secret as follows:

    oc create secret generic lon-token --from-literal=token=eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9...
  3. Repeat the preceding steps on your other OpenShift clusters.

13.2.3. Configuring managed cross-site connections

Configure Data Grid Operator to establish cross-site views with Data Grid clusters.

Prerequisites

  • Determine a suitable expose type for cross-site replication.
    If you use an OpenShift Route you must add a keystore with TLS certificates and secure cross-site connections.
  • Create and exchange Red Hat OpenShift service account tokens for each Data Grid cluster.

Procedure

  1. Create an Infinispan CR for each Data Grid cluster.
  2. Specify the name of the local site with spec.service.sites.local.name.
  3. Configure the expose type for cross-site replication.

    1. Set the value of the spec.service.sites.local.expose.type field to one of the following:

      • NodePort
      • LoadBalancer
      • Route
    2. Optionally specify a port or custom hostname with the following fields:

      • spec.service.sites.local.expose.nodePort if you use a NodePort service.
      • spec.service.sites.local.expose.port if you use a LoadBalancer service.
      • spec.service.sites.local.expose.routeHostName if you use an OpenShift Route.
  4. Specify the number of pods that can send RELAY messages with the service.sites.local.maxRelayNodes field.

    Tip

    Configure all pods in your cluster to send RELAY messages for better performance. If all pods send backup requests directly, then no pods need to forward backup requests.

  5. Provide the name, URL, and secret for each Data Grid cluster that acts as a backup location with spec.service.sites.locations.
  6. If Data Grid cluster names or namespaces at the remote site do not match the local site, specify those values with the clusterName and namespace fields.

    The following are example Infinispan CR definitions for LON and NYC:

    • LON

      apiVersion: infinispan.org/v1
      kind: Infinispan
      metadata:
        name: infinispan
      spec:
        replicas: 3
        service:
          type: DataGrid
          sites:
            local:
              name: LON
              expose:
                type: LoadBalancer
                port: 65535
              maxRelayNodes: 1
            locations:
              - name: NYC
                clusterName: <nyc_cluster_name>
                namespace: <nyc_cluster_namespace>
                url: openshift://api.rhdg-nyc.openshift-aws.myhost.com:6443
                secretName: nyc-token
        logging:
          categories:
            org.jgroups.protocols.TCP: error
            org.jgroups.protocols.relay.RELAY2: error
    • NYC

      apiVersion: infinispan.org/v1
      kind: Infinispan
      metadata:
        name: nyc-cluster
      spec:
        replicas: 2
        service:
          type: DataGrid
          sites:
            local:
              name: NYC
              expose:
                type: LoadBalancer
                port: 65535
              maxRelayNodes: 1
            locations:
              - name: LON
                clusterName: infinispan
                namespace: rhdg-namespace
                url: openshift://api.rhdg-lon.openshift-aws.myhost.com:6443
                secretName: lon-token
        logging:
          categories:
            org.jgroups.protocols.TCP: error
            org.jgroups.protocols.relay.RELAY2: error
      Important

      Be sure to adjust logging categories in your Infinispan CR to decrease log levels for JGroups TCP and RELAY2 protocols. This prevents a large number of log files from uses container storage.

      spec:
        logging:
          categories:
            org.jgroups.protocols.TCP: error
            org.jgroups.protocols.relay.RELAY2: error
  7. Configure your Infinispan CRs with any other Data Grid service resources and then apply the changes.
  8. Verify that Data Grid clusters form a cross-site view.

    1. Retrieve the Infinispan CR.

      oc get infinispan -o yaml
    2. Check for the type: CrossSiteViewFormed condition.

Next steps

If your clusters have formed a cross-site view, you can start adding backup locations to caches.

13.3. Manually configuring cross-site connections

You can specify static network connection details to perform cross-site replication with Data Grid clusters running outside OpenShift. Manual cross-site connections are necessary in any scenario where access to the Kubernetes API is not available outside the OpenShift cluster where Data Grid runs.

Prerequisites

  • Determine a suitable expose type for cross-site replication.
    If you use an OpenShift Route you must add a keystore with TLS certificates and secure cross-site connections.
  • Ensure you have the correct host names and ports for each Data Grid cluster and each <cluster-name>-site service.

    Manually connecting Data Grid clusters to form cross-site views requires predictable network locations for Data Grid services, which means you need to know the network locations before they are created.

Procedure

  1. Create an Infinispan CR for each Data Grid cluster.
  2. Specify the name of the local site with spec.service.sites.local.name.
  3. Configure the expose type for cross-site replication.

    1. Set the value of the spec.service.sites.local.expose.type field to one of the following:

      • NodePort
      • LoadBalancer
      • Route
    2. Optionally specify a port or custom hostname with the following fields:

      • spec.service.sites.local.expose.nodePort if you use a NodePort service.
      • spec.service.sites.local.expose.port if you use a LoadBalancer service.
      • spec.service.sites.local.expose.routeHostName if you use an OpenShift Route.
  4. Provide the name and static URL for each Data Grid cluster that acts as a backup location with spec.service.sites.locations, for example:

    • LON

      apiVersion: infinispan.org/v1
      kind: Infinispan
      metadata:
        name: infinispan
      spec:
        replicas: 3
        service:
          type: DataGrid
          sites:
            local:
              name: LON
              expose:
                type: LoadBalancer
                port: 65535
              maxRelayNodes: 1
            locations:
              - name: NYC
                url: infinispan+xsite://infinispan-nyc.myhost.com:7900
        logging:
          categories:
            org.jgroups.protocols.TCP: error
            org.jgroups.protocols.relay.RELAY2: error
    • NYC

      apiVersion: infinispan.org/v1
      kind: Infinispan
      metadata:
        name: infinispan
      spec:
        replicas: 2
        service:
          type: DataGrid
          sites:
            local:
              name: NYC
              expose:
                type: LoadBalancer
                port: 65535
              maxRelayNodes: 1
            locations:
              - name: LON
                url: infinispan+xsite://infinispan-lon.myhost.com
        logging:
          categories:
            org.jgroups.protocols.TCP: error
            org.jgroups.protocols.relay.RELAY2: error
      Important

      Be sure to adjust logging categories in your Infinispan CR to decrease log levels for JGroups TCP and RELAY2 protocols. This prevents a large number of log files from uses container storage.

      spec:
        logging:
          categories:
            org.jgroups.protocols.TCP: error
            org.jgroups.protocols.relay.RELAY2: error
  5. Configure your Infinispan CRs with any other Data Grid service resources and then apply the changes.
  6. Verify that Data Grid clusters form a cross-site view.

    1. Retrieve the Infinispan CR.

      oc get infinispan -o yaml
    2. Check for the type: CrossSiteViewFormed condition.

Next steps

If your clusters have formed a cross-site view, you can start adding backup locations to caches.

13.4. Resources for configuring cross-site replication

The following tables provides fields and descriptions for cross-site resources.

Table 13.1. service.type

FieldDescription

service.type: DataGrid

Data Grid supports cross-site replication with Data Grid service clusters only.

Table 13.2. service.sites.local

FieldDescription

service.sites.local.name

Names the local site where a Data Grid cluster runs.

service.sites.local.expose.type

Specifies the network service for cross-site replication. Data Grid clusters use this service to communicate and perform backup operations. You can set the value to NodePort, LoadBalancer, or Route.

service.sites.local.expose.nodePort

Specifies a static port within the default range of 30000 to 32767 if you expose Data Grid through a NodePort service. If you do not specify a port, the platform selects an available one.

service.sites.local.expose.port

Specifies the network port for the service if you expose Data Grid through a LoadBalancer service. The default port is 7900.

service.sites.local.expose.routeHostName

Specifies a custom hostname if you expose Data Grid through an OpenShift Route. If you do not set a value then OpenShift generates a hostname.

service.sites.local.maxRelayNodes

Specifies the maximum number of pods that can send RELAY messages for cross-site replication. The default value is 1.

Table 13.3. service.sites.locations

FieldDescription

service.sites.locations

Provides connection information for all backup locations.

service.sites.locations.name

Specifies a backup location that matches .spec.service.sites.local.name.

service.sites.locations.url

Specifies the URL of the Kubernetes API for managed connections or a static URL for manual connections.

Use openshift:// to specify the URL of the Kubernetes API for an OpenShift cluster.

Note that the openshift:// URL must present a valid, CA-signed certificate. You cannot use self-signed certificates.

Use the infinispan+xsite://<hostname>:<port> format for static hostnames and ports. The default port is 7900.

service.sites.locations.secretName

Specifies the secret that contains the service account token for the backup site.

service.sites.locations.clusterName

Specifies the cluster name at the backup location if it is different to the cluster name at the local site.

service.sites.locations.namespace

Specifies the namespace of the Data Grid cluster at the backup location if it does not match the namespace at the local site.

Managed cross-site connections

spec:
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        expose:
          type: LoadBalancer
        maxRelayNodes: 1
      locations:
      - name: NYC
        clusterName: <nyc_cluster_name>
        namespace: <nyc_cluster_namespace>
        url: openshift://api.site-b.devcluster.openshift.com:6443
        secretName: nyc-token

Manual cross-site connections

spec:
  service:
    type: DataGrid
    sites:
      local:
        name: LON
        expose:
          type: LoadBalancer
          port: 65535
        maxRelayNodes: 1
      locations:
      - name: NYC
        url: infinispan+xsite://infinispan-nyc.myhost.com:7900

13.5. Securing cross-site connections

Add keystores and trust stores so that Data Grid clusters can secure cross-site replication traffic.

You must add a keystore to use an OpenShift Route as the expose type for cross-site replication. Securing cross-site connections is optional if you use a NodePort or LoadBalancer as the expose type.

Prerequisites

  • Have a PKCS12 keystore that Data Grid can use to encrypt and decrypt RELAY messages.

    You must provide a keystore for relay pods and router pods to secure cross-site connections.
    The keystore can be the same for relay pods and router pods or you can provide separate keystores for each.
    You can also use the same keystore for each Data Grid cluster or a unique keystore for each cluster.

  • Optionally have a trust store that contains part of the certificate chain or root CA certificate that verifies public certificates for Data Grid relay pods and router pods.

    By default, Data Grid uses the Java trust store to verify public certificates.

Procedure

  1. Create cross-site encryption secrets.

    1. Create keystore secrets.
    2. Create trust store secrets if you do not want to use the default Java trust store.
  2. Modify the Infinispan CR for each Data Grid cluster to specify the secret name for the encryption.transportKeyStore.secretName and encryption.routerKeyStore.secretName fields.
  3. Configure any other fields to encrypt RELAY messages as required and then apply the changes.

    apiVersion: infinispan.org/v1
    kind: Infinispan
    metadata:
      name: infinispan
    spec:
      replicas: 2
      expose:
        type: LoadBalancer
      service:
        type: DataGrid
        sites:
          local:
            name: SiteA
            # ...
            encryption:
              protocol: TLSv1.3
              transportKeyStore:
                secretName: transport-tls-secret
                alias: transport
                filename: keystore.p12
              routerKeyStore:
                secretName: router-tls-secret
                alias: router
                filename: keystore.p12
              trustStore:
                secretName: truststore-tls-secret
                filename: truststore.p12
          locations:
            # ...

13.5.1. Resources for configuring cross-site encryption

The following tables provides fields and descriptions for encrypting cross-site connections.

Table 13.4. service.type.sites.local.encryption

FieldDescription

service.type.sites.local.encryption.protocol

Specifies the TLS protocol to use for cross-site connections. The default value is TLSv1.2 but you can set TLSv1.3 if required.

service.type.sites.local.encryption.transportKeyStore

Configures a keystore secret for relay pods.

service.type.sites.local.encryption.routerKeyStore

Configures a keystore secret for router pods.

service.type.sites.local.encryption.trustStore

Configures an optional trust store secret for relay pods and router pods.

Table 13.5. service.type.sites.local.encryption.transportKeyStore

FieldDescription

secretName

Specifies the secret that contains a keystore that relay pods can use to encrypt and decrypt RELAY messages. This field is required.

alias

Optionally specifies the alias of the certificate in the keystore. The default value is transport.

filename

Optionally specifies the filename of the keystore. The default value is keystore.p12.

Table 13.6. service.type.sites.local.encryption.routerKeyStore

FieldDescription

secretName

Specifies the secret that contains a keystore that router pods can use to encrypt and decrypt RELAY messages. This field is required.

alias

Optionally specifies the alias of the certificate in the keystore. The default value is router.

filename

Optionally specifies the filename of the keystore. The default value is keystore.p12.

Table 13.7. service.type.sites.local.encryption.trustStore

FieldDescription

secretName

Optionally specifies the secret that contains a trust store to verify public certificates for relay pods and router pods. The default value is <cluster-name>-truststore-site-tls-secret.

filename

Optionally specifies the filename of the trust store. The default value is truststore.p12.

13.5.2. Cross-site encryption secrets

Cross-site replication encryption secrets add keystores and optional trust stores for securing cross-site connections.

Cross-site encryption secrets

apiVersion: v1
kind: Secret
metadata:
  name: tls-secret
type: Opaque
stringData:
  password: changeme
  type: pkcs12
data:
  <file-name>: "MIIKDgIBAzCCCdQGCSqGSIb3DQEHA..."

FieldDescription

stringData.password

Specifies the password for the keystore or trust store.

stringData.type

Optionally specifies the keystore or trust store type. The default value is pkcs12.

data.<file-name>

Adds a base64-encoded keystore or trust store.

13.6. Configuring sites in the same OpenShift cluster

For evaluation and demonstration purposes, you can configure Data Grid to back up between pods in the same OpenShift cluster.

Important

Using ClusterIP as the expose type for cross-site replication is intended for demonstration purposes only. It would be appropriate to use this expose type only to perform a temporary proof-of-concept deployment on a laptop or something of that nature.

Procedure

  1. Create an Infinispan CR for each Data Grid cluster.
  2. Specify the name of the local site with spec.service.sites.local.name.
  3. Set ClusterIP as the value of the spec.service.sites.local.expose.type field.
  4. Provide the name of the Data Grid cluster that acts as a backup location with spec.service.sites.locations.clusterName.
  5. If both Data Grid clusters have the same name, specify the namespace of the backup location with spec.service.sites.locations.namespace.

    apiVersion: infinispan.org/v1
    kind: Infinispan
    metadata:
      name: example-clustera
    spec:
      replicas: 1
      expose:
        type: LoadBalancer
      service:
        type: DataGrid
        sites:
          local:
            name: SiteA
            expose:
              type: ClusterIP
            maxRelayNodes: 1
          locations:
            - name: SiteB
              clusterName: example-clusterb
              namespace: cluster-namespace
  6. Configure your Infinispan CRs with any other Data Grid service resources and then apply the changes.
  7. Verify that Data Grid clusters form a cross-site view.

    1. Retrieve the Infinispan CR.

      oc get infinispan -o yaml
    2. Check for the type: CrossSiteViewFormed condition.