Chapter 9. 3scale backup and restore

This section provides you, as the administrator of a Red Hat 3scale API Management installation, the information needed to:

  • Set up the backup procedures for persistent data.
  • Perform a restore from backup of the persistent data.

In case of issues with one or more of the MySQL databases, you will be able to restore 3scale correctly to its previous operational state.

9.1. Prerequisites

  • A 3scale 2.13 instance. For more information about how to install 3scale, see Installing 3scale on OpenShift.
  • An OpenShift Container Platform 4.x user account with one of the following roles in the OpenShift cluster:

    • cluster-admin
    • admin
    • edit

Note

A user with the edit cluster role locally bound in the namespace of a 3scale installation can perform backup and restore procedures.

The following sections describe how to set up backup procedures for persistent data and how to restore that data from a backup. If one or more of the MySQL databases fails, you can then restore 3scale to its previous operational state.

9.2. Persistent volumes and considerations

Persistent volumes

In a 3scale deployment on OpenShift, persistent storage is one of the following:

  • A persistent volume (PV) provided to the cluster by the underlying infrastructure.
  • A storage service external to the cluster. This can be in the same data center or elsewhere.

Considerations

The backup and restore procedures for persistent data vary depending on the storage type in use. To ensure that backups and restores preserve data consistency, it is not sufficient to back up the underlying PVs of a database, because a volume-level copy can capture partial writes and partial transactions. Use the database’s backup mechanisms instead.

Some parts of the data are synchronized between different components. One copy is considered the source of truth for the data set. The other is a copy that is not modified locally, but synchronized from the source of truth. In these cases, restore the source of truth and then synchronize the copies in other components from it.

9.3. Using data sets

This section describes the data sets in the different persistent stores in more detail: their purpose, the storage type used, and whether each one is the source of truth.

The full state of a 3scale deployment is stored across the following DeploymentConfig objects and their PVs:

Name              Description
system-mysql      MySQL database (mysql-storage)
system-storage    Volume for Files
backend-redis     Redis database (backend-redis-storage)
system-redis      Redis database (system-redis-storage)

9.3.1. Defining system-mysql

system-mysql is a relational database which stores information about users, accounts, APIs, plans, and more, in the 3scale Admin Console.

A subset of this information related to services is synchronized to the Backend component and stored in backend-redis. system-mysql is the source of truth for this information.

9.3.2. Defining system-storage

system-storage stores files to be read and written by the System component.

They fall into two categories:

  • Configuration files read by the System component at run-time
  • Static files, for example HTML, CSS, and JS, uploaded to System by its CMS feature to create a Developer Portal

Note

System can be scaled horizontally with multiple pods uploading and reading these static files, hence the need for a ReadWriteMany (RWX) PersistentVolume.

9.3.3. Defining backend-redis

backend-redis contains multiple data sets used by the Backend component:

  • Usages: This is API usage information aggregated by Backend. It is used by Backend for rate-limiting decisions and by System to display analytics information in the UI or via API.
  • Config: This is configuration information about services, rate-limits, and more, that is synchronized from System via an internal API. backend-redis is not the source of truth for this information; System and system-mysql are.
  • Queues: These are queues of background jobs to be executed by worker processes. They are ephemeral and are deleted once processed.

9.3.4. Defining system-redis

system-redis contains queues for jobs to be processed in background. These are ephemeral and are deleted once processed.

9.4. Backing up system databases

The following commands are in no specific order and can be used as you need them to back up and archive system databases.

9.4.1. Backing up system-mysql

Run the MySQL backup command:

oc rsh $(oc get pods -l 'deploymentConfig=system-mysql' -o json | jq -r '.items[0].metadata.name') bash -c 'export MYSQL_PWD=${MYSQL_ROOT_PASSWORD}; mysqldump --single-transaction -hsystem-mysql -uroot system' | gzip > system-mysql-backup.gz
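
Before relying on the archive, it can help to confirm that it is readable and complete. The following is a minimal sketch and not part of the documented procedure: verify_mysql_dump is a hypothetical helper name, and the check assumes that mysqldump ran with its default comment settings, in which case the output normally ends with a "-- Dump completed" line.

```shell
# Hypothetical helper: check that a gzipped mysqldump archive is intact.
verify_mysql_dump() {
  dump_file="$1"
  # gzip -t tests the compressed stream without extracting it.
  gzip -t "$dump_file" || return 1
  # Assumption: default mysqldump settings, so the last line is a
  # "-- Dump completed on <date>" comment.
  gzip -dc "$dump_file" | tail -n 1 | grep -q '^-- Dump completed'
}

# Example usage:
# verify_mysql_dump system-mysql-backup.gz && echo "dump looks complete"
```

A dump that fails this check might have been truncated, for example if the oc rsh session was interrupted, and is worth re-creating before it is needed for a restore.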

9.4.2. Backing up system-storage

Archive the system-storage files to another storage location:

oc rsync $(oc get pods -l 'deploymentConfig=system-app' -o json | jq '.items[0].metadata.name' -r):/opt/system/public/system ./local/dir

9.4.3. Backing up backend-redis

Back up the dump.rdb file from Redis:

oc cp $(oc get pods -l 'deploymentConfig=backend-redis' -o json | jq '.items[0].metadata.name' -r):/var/lib/redis/data/dump.rdb ./backend-redis-dump.rdb

9.4.4. Backing up system-redis

Back up the dump.rdb file from Redis:

oc cp $(oc get pods -l 'deploymentConfig=system-redis' -o json | jq '.items[0].metadata.name' -r):/var/lib/redis/data/dump.rdb ./system-redis-dump.rdb
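
A quick sanity check on a copied dump.rdb file is to look at its first bytes: Redis RDB files begin with the ASCII string REDIS followed by a format version. The sketch below is illustrative only; is_rdb_file is a hypothetical helper name, and the check confirms the file type but not that the snapshot is complete or recent.

```shell
# Hypothetical helper: succeed if the file is non-empty and starts with
# the RDB magic string "REDIS".
is_rdb_file() {
  [ -s "$1" ] || return 1
  [ "$(head -c 5 "$1")" = "REDIS" ]
}

# Example usage:
# is_rdb_file ./backend-redis-dump.rdb || echo "backend-redis dump is not an RDB file"
# is_rdb_file ./system-redis-dump.rdb  || echo "system-redis dump is not an RDB file"
```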

9.4.5. Backing up zync-database

Back up the zync_production database:

oc rsh $(oc get pods -l 'deploymentConfig=zync-database' -o json | jq -r '.items[0].metadata.name') bash -c 'pg_dump zync_production' | gzip > zync-database-backup.gz
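
Once the database backup files from the previous sections exist (system-mysql-backup.gz, backend-redis-dump.rdb, system-redis-dump.rdb, and zync-database-backup.gz), recording their checksums makes it possible to detect corruption after the files are moved to other storage. This is a sketch under assumptions: the helper names and the manifest file name are hypothetical, and it relies on the sha256sum utility being available on the workstation.

```shell
# Hypothetical helper: write SHA-256 checksums of the given files to a manifest.
write_backup_manifest() {
  manifest="$1"
  shift
  sha256sum "$@" > "$manifest"
}

# Hypothetical helper: verify files against a manifest.
# Run it from the directory that the manifest's file paths are relative to.
verify_backup_manifest() {
  sha256sum -c "$1" > /dev/null
}

# Example usage:
# write_backup_manifest backups.sha256 system-mysql-backup.gz zync-database-backup.gz
# verify_backup_manifest backups.sha256 || echo "a backup file changed or is missing"
```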

9.4.6. Backing up OpenShift secrets and ConfigMaps

The following is the list of commands for OpenShift secrets and ConfigMaps:

9.4.6.1. OpenShift secrets

oc get secrets system-smtp -o json > system-smtp.json
oc get secrets system-seed -o json > system-seed.json
oc get secrets system-database -o json > system-database.json
oc get secrets backend-internal-api -o json > backend-internal-api.json
oc get secrets system-events-hook -o json > system-events-hook.json
oc get secrets system-app -o json > system-app.json
oc get secrets system-recaptcha -o json > system-recaptcha.json
oc get secrets system-redis -o json > system-redis.json
oc get secrets zync -o json > zync.json
oc get secrets system-master-apicast -o json > system-master-apicast.json

9.4.6.2. ConfigMaps

oc get configmaps system-environment -o json > system-environment.json
oc get configmaps apicast-environment -o json > apicast-environment.json
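
The individual commands above all follow the same pattern, so they can be wrapped in a loop. The following sketch assumes you are logged in with oc and are in the project of the 3scale installation; backup_resources is a hypothetical helper name, not an oc subcommand.

```shell
# Hypothetical helper: export each named resource of one kind to <name>.json.
backup_resources() {
  kind="$1"
  shift
  for name in "$@"; do
    # Same command as in the lists above, run once per resource name.
    # Stop at the first failure so that a missing resource is noticed.
    oc get "$kind" "$name" -o json > "${name}.json" || return 1
  done
}

# Example usage (names taken from the lists above):
# backup_resources secrets system-smtp system-seed system-database
# backup_resources configmaps system-environment apicast-environment
```

If a command fails, the redirection can still leave an empty .json file behind, so check the return status rather than only the presence of the files.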

9.5. Restoring system databases

Important

Before you restore, prevent the creation of new records by scaling down pods such as system-app or by disabling routes.

In the command and snippet examples that follow, replace ${DEPLOYMENT_NAME} with the name you defined when you created your 3scale deployment.

Note

When you check the content of $SYSTEM_SPEC in step 2 of the following procedure, ensure the output includes at least a pair of braces {} and is not empty.

Procedure

  1. Store the current number of replicas to scale up later:

    SYSTEM_SPEC=`oc get APIManager/${DEPLOYMENT_NAME} -o jsonpath='{.spec.system.appSpec}'`
  2. Verify the result of the previous command and check the content of $SYSTEM_SPEC:

    echo $SYSTEM_SPEC
  3. Patch the APIManager CR using the following command that scales the number of replicas to 0:

    $ oc patch APIManager/${DEPLOYMENT_NAME} --type merge -p '{"spec": {"system": {"appSpec": {"replicas": 0}}}}'

    Alternatively, to scale down system-app, edit the existing APIManager/${DEPLOYMENT_NAME} and set the number of system replicas to zero as shown in the following example:

    apiVersion: apps.3scale.net/v1alpha1
    kind: APIManager
    metadata:
      name: <DEPLOYMENT_NAME>
    spec:
      system:
        appSpec:
          replicas: 0
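
The check on $SYSTEM_SPEC in step 2 can also be scripted, so that the scale-down in step 3 only runs when the stored value is usable later. This is a minimal sketch; spec_is_usable is a hypothetical helper name, and it only tests that the value starts and ends with braces, not that it is valid JSON.

```shell
# Hypothetical helper: succeed only if the stored spec is non-empty and
# is wrapped in a pair of braces, for example {} or {"replicas":2}.
spec_is_usable() {
  case "$1" in
    "{"*"}") return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage:
# spec_is_usable "$SYSTEM_SPEC" || echo "SYSTEM_SPEC is empty or malformed; do not continue"
```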

Use the following procedures to restore OpenShift secrets and system databases:

9.5.1. Restoring an operator-based deployment

Use the following steps to restore operator-based deployments.

Procedure

  1. Install the 3scale operator on OpenShift.
  2. Restore secrets before creating an APIManager resource:

    $ oc apply -f system-smtp.json
    $ oc apply -f system-seed.json
    $ oc apply -f system-database.json
    $ oc apply -f backend-internal-api.json
    $ oc apply -f system-events-hook.json
    $ oc apply -f system-app.json
    $ oc apply -f system-recaptcha.json
    $ oc apply -f system-redis.json
    $ oc apply -f zync.json
    $ oc apply -f system-master-apicast.json
  3. Restore ConfigMaps before creating an APIManager resource:

    $ oc apply -f system-environment.json
    $ oc apply -f apicast-environment.json
  4. Deploy 3scale with the operator using the APIManager CR.

9.5.2. Restoring system-mysql

Procedure

  1. Copy the MySQL dump to the system-mysql pod:

    $ oc cp ./system-mysql-backup.gz $(oc get pods -l 'deploymentConfig=system-mysql' -o json | jq '.items[0].metadata.name' -r):/var/lib/mysql
  2. Decompress the backup file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=system-mysql' -o json | jq -r '.items[0].metadata.name') bash -c 'gzip -d ${HOME}/system-mysql-backup.gz'
  3. Restore the MySQL DB Backup file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=system-mysql' -o json | jq -r '.items[0].metadata.name') bash -c 'export MYSQL_PWD=${MYSQL_ROOT_PASSWORD}; mysql -hsystem-mysql -uroot system < ${HOME}/system-mysql-backup'
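
Each of the three steps above looks up the pod name again. As an optional convenience, the lookup can be done once and reused. The sketch below is an assumption about preference rather than the documented command: first_pod is a hypothetical helper name, and it uses the -o jsonpath output of oc instead of piping JSON through jq.

```shell
# Hypothetical helper: print the name of the first pod that carries the
# deploymentConfig=<name> label.
first_pod() {
  oc get pods -l "deploymentConfig=$1" -o jsonpath='{.items[0].metadata.name}'
}

# Example usage with the copy step above:
# MYSQL_POD=$(first_pod system-mysql)
# oc cp ./system-mysql-backup.gz "${MYSQL_POD}:/var/lib/mysql"
```

Resolving the name once also guarantees that all three steps run against the same pod.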

9.5.3. Restoring system-storage

Restore the Backup file to system-storage:

$ oc rsync ./local/dir/system/ $(oc get pods -l 'deploymentConfig=system-app' -o json | jq '.items[0].metadata.name' -r):/opt/system/public/system

9.5.4. Restoring zync-database

Use the following instructions to restore zync-database for a 3scale operator-based deployment.

9.5.4.1. Operator-based deployments

Note

Follow the instructions under Deploying 3scale using the operator, in particular Deploying the APIManager CR to redeploy your 3scale instance.

Procedure

  1. Store the number of replicas, by replacing ${DEPLOYMENT_NAME} with the name you defined when you created your 3scale deployment:

    ZYNC_SPEC=`oc get APIManager/${DEPLOYMENT_NAME} -o json | jq -r '.spec.zync'`
  2. Scale down the zync DeploymentConfig to 0 pods:

    $ oc patch APIManager/${DEPLOYMENT_NAME} --type merge -p '{"spec": {"zync": {"appSpec": {"replicas": 0}, "queSpec": {"replicas": 0}}}}'
  3. Copy the zync database dump to the zync-database pod:

    $ oc cp ./zync-database-backup.gz $(oc get pods -l 'deploymentConfig=zync-database' -o json | jq '.items[0].metadata.name' -r):/var/lib/pgsql/
  4. Decompress the backup file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=zync-database' -o json | jq -r '.items[0].metadata.name') bash -c 'gzip -d ${HOME}/zync-database-backup.gz'
  5. Restore zync database backup file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=zync-database' -o json | jq -r '.items[0].metadata.name') bash -c 'psql zync_production -f ${HOME}/zync-database-backup'
  6. Restore to the original count of replicas:

    $ oc patch APIManager/${DEPLOYMENT_NAME} --type json -p '[{"op": "replace", "path": "/spec/zync", "value":'"$ZYNC_SPEC"'}]'
    • If the output of the following command does not contain the replicas key:

      $ echo $ZYNC_SPEC
    • Then, run the following additional command to scale up zync:

      $ oc patch dc/zync -p '{"spec": {"replicas": 1}}'
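
The conditional in step 6 can be expressed as a small check on the stored value. This is a sketch only: spec_has_replicas is a hypothetical helper name, and it performs a plain text search for the replicas key anywhere in the stored JSON rather than parsing it.

```shell
# Hypothetical helper: succeed if the stored spec mentions a "replicas" key.
spec_has_replicas() {
  printf '%s' "$1" | grep -q '"replicas"'
}

# Example usage, running the additional scale-up command only when needed:
# spec_has_replicas "$ZYNC_SPEC" || oc patch dc/zync -p '{"spec": {"replicas": 1}}'
```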

9.5.4.2. Restoring 3scale options with backend-redis and system-redis

By restoring 3scale, you will restore backend-redis and system-redis. These components have the following functions:

  • backend-redis: The database that supports application authentication and rate limiting in 3scale. It is also used for statistics storage and temporary job storage.
  • system-redis: Provides temporary storage for background jobs for 3scale and is also used as a message bus for Ruby processes of system-app pods.

The backend-redis component

The backend-redis component has two databases, data and queues. In a default 3scale deployment, data and queues are deployed in the same Redis database, but in different logical database indexes, /0 and /1. Restoring the data database runs without any issues; however, restoring the queues database can lead to duplicated jobs.

Regarding duplication of jobs, in 3scale the backend workers process background jobs in a matter of milliseconds. If backend-redis fails 30 seconds after the last database snapshot and you try to restore it, the background jobs that happened during those 30 seconds are performed twice because backend does not have a system in place to avoid duplication.

In this scenario, you must restore the backup because the /0 database index contains data that is not saved anywhere else. Restoring the /0 database index means that you must also restore the /1 database index, since one cannot be restored without the other. If you instead deploy the two databases on different servers, rather than as different indexes of one database, the size of the queue will be approximately zero, so it is preferable not to restore the queues backup and to lose a few background jobs. This is the case in a 3scale Hosted setup, so you need to apply different backup and restore strategies for each database.

The system-redis component

The majority of the 3scale system background jobs are idempotent, that is, identical requests return an identical result no matter how many times you run them.

The following is a list of examples of events handled by background jobs in system:

  • Notification jobs such as plan trials about to expire, credit cards about to expire, activation reminders, plan changes, invoice state changes, PDF reports.
  • Billing such as invoicing and charging.
  • Deletion of complex objects.
  • Backend synchronization jobs.
  • Indexation jobs, for example with sphinx.
  • Sanitisation jobs, for example invoice IDs.
  • Janitorial tasks such as purging audits, user sessions, expired tokens, log entries, suspending inactive accounts.
  • Traffic updates.
  • Proxy configuration change monitoring and proxy deployments.
  • Background signup jobs.
  • Zync jobs such as Single sign-on (SSO) synchronization, routes creation.

If you restore the background jobs listed above, 3scale’s system maintains the state of each restored job. Check the integrity of the system after the restoration is complete.

9.5.5. Ensuring information consistency between backend and system

After restoring backend-redis, force a sync of the Config information from system to ensure that the information in backend is consistent with that in system, which is the source of truth.

9.5.5.1. Managing the deployment configuration for backend-redis

These steps are intended for running instances of backend-redis.

Procedure

  1. Edit the redis-config configmap:

    $ oc edit configmap redis-config
  2. Comment SAVE commands in the redis-config configmap:

     #save 900 1
     #save 300 10
     #save 60 10000
  3. Set appendonly to no in the redis-config configmap:

    appendonly no
  4. Redeploy backend-redis to load the new configurations:

    $ oc rollout latest dc/backend-redis
  5. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/backend-redis
  6. Rename the dump.rdb file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=backend-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'mv ${HOME}/data/dump.rdb ${HOME}/data/dump.rdb-old'
  7. Rename the appendonly.aof file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=backend-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'mv ${HOME}/data/appendonly.aof ${HOME}/data/appendonly.aof-old'
  8. Copy the backup file to the pod:

    $ oc cp ./backend-redis-dump.rdb $(oc get pods -l 'deploymentConfig=backend-redis' -o json | jq '.items[0].metadata.name' -r):/var/lib/redis/data/dump.rdb
  9. Redeploy backend-redis to load the backup:

    $ oc rollout latest dc/backend-redis
  10. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/backend-redis
  11. Create the appendonly file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=backend-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'redis-cli BGREWRITEAOF'
  12. After a while, ensure that the AOF rewrite is complete:

    $ oc rsh $(oc get pods -l 'deploymentConfig=backend-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'redis-cli info' | grep aof_rewrite_in_progress
    • While aof_rewrite_in_progress = 1, the execution is in progress.
    • Check periodically until aof_rewrite_in_progress = 0. Zero indicates that the execution is complete.
  13. Edit the redis-config configmap:

    $ oc edit configmap redis-config
  14. Uncomment SAVE commands in the redis-config configmap:

     save 900 1
     save 300 10
     save 60 10000
  15. Set appendonly to yes in the redis-config configmap:

    appendonly yes
  16. Redeploy backend-redis to reload the default configurations:

    $ oc rollout latest dc/backend-redis
  17. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/backend-redis
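
The periodic check in step 12 can be automated with a polling loop. The sketch below is illustrative and not part of the documented procedure: wait_for_aof_rewrite is a hypothetical helper name, the interval and attempt limits are arbitrary defaults, and it relies on redis-cli printing INFO fields as key:value lines (for example aof_rewrite_in_progress:0) with carriage returns at the line ends.

```shell
# Hypothetical helper: poll a Redis pod until aof_rewrite_in_progress is 0.
# Arguments: pod name, seconds between checks (default 5),
# maximum number of checks (default 60).
wait_for_aof_rewrite() {
  pod="$1"
  interval="${2:-5}"
  attempts="${3:-60}"
  while [ "$attempts" -gt 0 ]; do
    # Strip carriage returns so the value compares cleanly.
    state=$(oc rsh "$pod" redis-cli info persistence | tr -d '\r' |
      grep '^aof_rewrite_in_progress:' | cut -d: -f2)
    [ "$state" = "0" ] && return 0
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  # The rewrite did not finish within the allowed number of checks.
  return 1
}

# Example usage, with the pod name resolved as in the steps above:
# wait_for_aof_rewrite "$REDIS_POD" && echo "AOF rewrite complete"
```

The same loop applies unchanged to a system-redis pod, because only the pod name differs.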

9.5.5.2. Managing the deployment configuration for system-redis

These steps are intended for running instances of system-redis.

Procedure

  1. Edit the redis-config configmap:

    $ oc edit configmap redis-config
  2. Comment SAVE commands in the redis-config configmap:

     #save 900 1
     #save 300 10
     #save 60 10000
  3. Set appendonly to no in the redis-config configmap:

    appendonly no
  4. Redeploy system-redis to load the new configurations:

    $ oc rollout latest dc/system-redis
  5. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/system-redis
  6. Rename the dump.rdb file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=system-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'mv ${HOME}/data/dump.rdb ${HOME}/data/dump.rdb-old'
  7. Rename the appendonly.aof file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=system-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'mv ${HOME}/data/appendonly.aof ${HOME}/data/appendonly.aof-old'
  8. Copy the backup file to the pod:

    $ oc cp ./system-redis-dump.rdb $(oc get pods -l 'deploymentConfig=system-redis' -o json | jq '.items[0].metadata.name' -r):/var/lib/redis/data/dump.rdb
  9. Redeploy system-redis to load the backup:

    $ oc rollout latest dc/system-redis
  10. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/system-redis
  11. Create the appendonly file:

    $ oc rsh $(oc get pods -l 'deploymentConfig=system-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'redis-cli BGREWRITEAOF'
  12. After a while, ensure that the AOF rewrite is complete:

    $ oc rsh $(oc get pods -l 'deploymentConfig=system-redis' -o json | jq '.items[0].metadata.name' -r) bash -c 'redis-cli info' | grep aof_rewrite_in_progress
    • While aof_rewrite_in_progress = 1, the execution is in progress.
    • Check periodically until aof_rewrite_in_progress = 0. Zero indicates that the execution is complete.
  13. Edit the redis-config configmap:

    $ oc edit configmap redis-config
  14. Uncomment SAVE commands in the redis-config configmap:

     save 900 1
     save 300 10
     save 60 10000
  15. Set appendonly to yes in the redis-config configmap:

    appendonly yes
  16. Redeploy system-redis to reload the default configurations:

    $ oc rollout latest dc/system-redis
  17. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/system-redis

9.5.6. Restoring backend-worker

These steps are intended to restore backend-worker.

Procedure

  1. Restore to the latest version of backend-worker:

    $ oc rollout latest dc/backend-worker
  2. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/backend-worker

9.5.7. Restoring system-app

These steps are intended to restore system-app.

Procedure

  1. To scale up system-app, edit the existing APIManager/${DEPLOYMENT_NAME} and change .spec.system.appSpec.replicas back to the original number of replicas, or run the following command to apply the previously stored specification:

    $ oc patch APIManager/${DEPLOYMENT_NAME} --type json -p '[{"op": "replace", "path": "/spec/system/appSpec", "value":'"$SYSTEM_SPEC"'}]'
    • If the output of the following command does not contain the replicas key:

      $ echo $SYSTEM_SPEC
    • Then, run the following additional command to scale up system-app:

      $ oc patch dc/system-app -p '{"spec": {"replicas": 1}}'
  2. Restore to the latest version of system-app:

    $ oc rollout latest dc/system-app
  3. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/system-app

9.5.8. Restoring system-sidekiq

These steps are intended to restore system-sidekiq.

Procedure

  1. Restore to the latest version of system-sidekiq:

    $ oc rollout latest dc/system-sidekiq
  2. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/system-sidekiq

9.5.8.1. Restoring system-sphinx

These steps are intended to restore system-sphinx.

Procedure

  1. Restore to the latest version of system-sphinx:

    $ oc rollout latest dc/system-sphinx
  2. Check the status of the rollout to ensure it has finished:

    $ oc rollout status dc/system-sphinx

9.5.8.2. Restoring OpenShift routes managed by zync

  • Force zync to recreate missing OpenShift routes:

    $ oc rsh $(oc get pods -l 'deploymentConfig=system-sidekiq' -o json | jq '.items[0].metadata.name' -r) bash -c 'bundle exec rake zync:resync:domains'