
Chapter 10. Geo-replication

Geo-replication allows multiple, geographically distributed Red Hat Quay deployments to work as a single registry from the perspective of a client or user. It significantly improves push and pull performance in a globally-distributed Red Hat Quay setup. Image data is asynchronously replicated in the background with transparent failover / redirect for clients.

With Red Hat Quay 3.7, geo-replication is supported in both standalone and Operator deployments.

10.1. Geo-replication features

  • When geo-replication is configured, container image pushes will be written to the preferred storage engine for that Red Hat Quay instance (typically the nearest storage backend within the region).
  • After the initial push, image data will be replicated in the background to other storage engines.
  • The list of replication locations is configurable, and those locations can be different storage backends (see the configuration sketch after this list).
  • An image pull will always use the closest available storage engine, to maximize pull performance.
  • If replication hasn’t been completed yet, the pull will use the source storage backend instead.
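
For illustration, the replication locations and the per-instance preference are expressed in the Red Hat Quay configuration. The following is a minimal sketch only; the storage names, driver, and bucket names are illustrative, credentials are omitted, and a complete example appears in Section 10.3.1.1:

DISTRIBUTED_STORAGE_CONFIG:
  usstorage:
    - GoogleCloudStorage            # driver; any supported object storage driver can be used
    - bucket_name: example-bucket-us
      storage_path: /quay
  eustorage:
    - GoogleCloudStorage
    - bucket_name: example-bucket-eu
      storage_path: /quay
DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS:   # replicate every blob to both locations
  - usstorage
  - eustorage
DISTRIBUTED_STORAGE_PREFERENCE:          # preferred (local) engine for this instance
  - usstorage
  - eustorage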

10.2. Geo-replication requirements and constraints

  • A single database, and therefore all metadata and Quay configuration, is shared across all regions.
  • A single Redis cache is shared across the entire Quay setup and needs to be accessible by all Quay pods.
  • The exact same configuration should be used across all regions, with the exception of the storage backend, which can be configured explicitly using the QUAY_DISTRIBUTED_STORAGE_PREFERENCE environment variable (see the example at the end of this section).
  • Geo-replication requires object storage in each region. It does not work with local storage or NFS.
  • Each region must be able to access every storage engine in every other region, which requires a network path.
  • Alternatively, the storage proxy option can be used.
  • The entire storage backend (all blobs) is replicated. This is in contrast to repository mirroring, which can be limited to an organization, repository, or image.
  • All Quay instances must share the same entrypoint, typically via a load balancer.
  • All Quay instances must have the same set of superusers, as they are defined inside the common configuration file.
  • Geo-replication requires your Clair configuration to be set to unmanaged. An unmanaged Clair database allows the Red Hat Quay Operator to work in a geo-replicated environment, where multiple instances of the Operator must communicate with the same database. For more information, see Advanced Clair configuration.
  • Geo-replication requires SSL/TLS certificates and keys. For more information, see Using SSL to protect connections to Red Hat Quay.

If the above requirements cannot be met, you should instead use two or more distinct Quay deployments and take advantage of repository mirroring functionality.
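
For a standalone (non-Operator) deployment, the storage preference override mentioned above can be passed to the Quay container as an environment variable. The following is a minimal sketch, assuming a podman-based standalone deployment with the configuration mounted at /conf/stack; the image tag, host ports, and paths are illustrative:

$ sudo podman run -d --name quay \
    -e QUAY_DISTRIBUTED_STORAGE_PREFERENCE=usstorage \
    -p 80:8080 -p 443:8443 \
    -v /opt/quay/config:/conf/stack:Z \
    registry.redhat.io/quay/quay-rhel8:v3.7.0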

10.3. Geo-replication using the Red Hat Quay Operator

Geo-replication architecture

In the example shown above, the Red Hat Quay Operator is deployed in two separate regions, with a common database and a common Redis instance. Localized image storage is provided in each region and image pulls are served from the closest available storage engine. Container image pushes are written to the preferred storage engine for the Quay instance, and will then be replicated, in the background, to the other storage engines.

Because the Operator now manages the Clair security scanner and its database separately, geo-replication setups can be configured so that the Operator does not manage the Clair database. Instead, an external, shared database is used. Red Hat Quay and Clair support several providers and vendors of PostgreSQL, which can be found in the Quay Enterprise 3.x test matrix. Additionally, the Operator supports custom Clair configurations that can be injected into the deployment, which allows users to configure Clair with the connection credentials for the external database.
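
As an illustration of such an injected configuration, the Clair indexer, matcher, and notifier can all be pointed at the shared external PostgreSQL database. The following is a sketch only; the hostname, credentials, and database name are placeholders, and the full format is covered in Advanced Clair configuration:

# Clair config.yaml (sketch): all Clair components share one external database
indexer:
  connstring: host=clair-db.example.com port=5432 dbname=clair user=clair password=changeme sslmode=require
matcher:
  connstring: host=clair-db.example.com port=5432 dbname=clair user=clair password=changeme sslmode=require
notifier:
  connstring: host=clair-db.example.com port=5432 dbname=clair user=clair password=changeme sslmode=require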

10.3.1. Setting up geo-replication on OpenShift

Procedure

  1. Deploy a Quay PostgreSQL instance:

    1. Log in to the database
    2. Create a database for Quay

      CREATE DATABASE quay;
    3. Enable the pg_trgm extension inside the database

      \c quay;
      CREATE EXTENSION IF NOT EXISTS pg_trgm;
  2. Deploy a Redis instance:

    Note
    • Deploying a Redis instance might be unnecessary if your cloud provider has its own service.
    • Deploying a Redis instance is required if you are leveraging Builders.
    1. Deploy a VM for Redis
    2. Make sure that it is accessible from the clusters where Quay is running
    3. Port 6379/TCP must be open
    4. Run Redis inside the instance

      sudo dnf install -y podman
      podman run -d --name redis -p 6379:6379 redis
  3. Create two object storage backends, one for each cluster

    Ideally, one object storage bucket will be located close to the first (primary) cluster, while the other will be located close to the second (secondary) cluster (see the bucket-creation sketch after this procedure).

  4. Deploy the clusters with the same config bundle, using environment variable overrides to select the appropriate storage backend for an individual cluster
  5. Configure a load balancer, to provide a single entry point to the clusters
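
As referenced in step 3, the two object storage buckets can be created ahead of time with your cloud provider's tooling. The following is a minimal sketch assuming Google Cloud Storage and the gsutil CLI; the bucket locations are illustrative:

$ gsutil mb -l US-EAST1 gs://georep-test-bucket-0
$ gsutil mb -l EUROPE-WEST1 gs://georep-test-bucket-1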

10.3.1.1. Configuration

The config.yaml file is shared between clusters, and will contain the details for the common PostgreSQL, Redis and storage backends:

config.yaml

DB_CONNECTION_ARGS:
  autorollback: true
  threadlocals: true
DB_URI: postgresql://postgres:password@10.19.0.1:5432/quay 1
BUILDLOGS_REDIS:
  host: 10.19.0.2
  port: 6379
USER_EVENTS_REDIS:
  host: 10.19.0.2
  port: 6379
DISTRIBUTED_STORAGE_CONFIG:
  usstorage:
    - GoogleCloudStorage
    - access_key: GOOGQGPGVMASAAMQABCDEFG
      bucket_name: georep-test-bucket-0
      secret_key: AYWfEaxX/u84XRA2vUX5C987654321
      storage_path: /quaygcp
  eustorage:
    - GoogleCloudStorage
    - access_key: GOOGQGPGVMASAAMQWERTYUIOP
      bucket_name: georep-test-bucket-1
      secret_key: AYWfEaxX/u84XRA2vUX5Cuj12345678
      storage_path: /quaygcp
DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS:
  - usstorage
  - eustorage
DISTRIBUTED_STORAGE_PREFERENCE:
  - usstorage
  - eustorage

1: To retrieve the configuration file for a Clair instance deployed using the OpenShift Operator, see Retrieving the Clair config.

Create the configBundleSecret:

$ oc create secret generic --from-file config.yaml=./config.yaml georep-config-bundle
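
You can confirm that the secret exists and contains the expected config.yaml key before referencing it from the QuayRegistry (the namespace is illustrative):

$ oc get secret georep-config-bundle -n quay-enterprise -o yaml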

In each of the clusters, set the configBundleSecret and use the QUAY_DISTRIBUTED_STORAGE_PREFERENCE environment variable override to configure the appropriate storage for that cluster:

Note

The config.yaml file between both deployments must match. If making a change to one cluster, it must also be changed in the other.
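
To propagate a configuration change, the secret can be regenerated in each cluster from the updated config.yaml, for example (a sketch using the oc dry-run/apply idiom):

$ oc create secret generic georep-config-bundle --from-file config.yaml=./config.yaml --dry-run=client -o yaml | oc apply -f -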

US cluster

apiVersion: quay.redhat.com/v1
kind: QuayRegistry
metadata:
  name: example-registry
  namespace: quay-enterprise
spec:
  configBundleSecret: georep-config-bundle
  components:
    - kind: postgres
      managed: false
    - kind: clairpostgres
      managed: false
    - kind: redis
      managed: false
    - kind: quay
      managed: true
      overrides:
        env:
          - name: QUAY_DISTRIBUTED_STORAGE_PREFERENCE
            value: usstorage

European cluster

apiVersion: quay.redhat.com/v1
kind: QuayRegistry
metadata:
  name: example-registry
  namespace: quay-enterprise
spec:
  configBundleSecret: georep-config-bundle
  components:
    - kind: postgres
      managed: false
    - kind: clairpostgres
      managed: false
    - kind: redis
      managed: false
    - kind: quay
      managed: true
      overrides:
        env:
          - name: QUAY_DISTRIBUTED_STORAGE_PREFERENCE
            value: eustorage

10.3.2. Mixed storage for geo-replication

Red Hat Quay geo-replication supports the use of different and multiple replication targets, for example, AWS S3 storage on the public cloud and Ceph storage on premises. This complicates the key requirement of granting access to all storage backends from all Red Hat Quay pods and cluster nodes. As a result, it is recommended that you:

  • Use a VPN to prevent visibility of the internal storage or
  • Use a token pair that only allows access to the specified bucket used by Quay

This will result in the public cloud instance of Red Hat Quay having access to on-premises storage, but the network will be encrypted and protected, and will use ACLs, thereby meeting security requirements.

If you cannot implement these security measures, it may be preferable to deploy two distinct Red Hat Quay registries and to use repository mirroring as an alternative to geo-replication.
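
For illustration, a mixed-storage DISTRIBUTED_STORAGE_CONFIG might pair an AWS S3 bucket with an on-premises Ceph RADOS Gateway bucket. The following is a hedged sketch only; the endpoints, credentials, and bucket names are placeholders, and the exact driver parameters should be verified against the Red Hat Quay storage documentation:

DISTRIBUTED_STORAGE_CONFIG:
  s3storage:
    - S3Storage                          # AWS S3 bucket on the public cloud
    - s3_access_key: <aws-access-key>
      s3_secret_key: <aws-secret-key>
      s3_bucket: georep-public-bucket
      storage_path: /quay
  cephstorage:
    - RadosGWStorage                     # on-premises Ceph RADOS Gateway bucket
    - access_key: <ceph-access-key>
      secret_key: <ceph-secret-key>
      bucket_name: georep-onprem-bucket
      hostname: ceph-gw.internal.example.com
      is_secure: true
      storage_path: /quay
DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS:
  - s3storage
  - cephstorage
DISTRIBUTED_STORAGE_PREFERENCE:
  - s3storage
  - cephstorage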