Migrate Storage from ODF/noobaa S3 to new S3 storage - Quay


Environment

  • Red Hat OpenShift Container Platform 4 (RHOCP)
  • Red Hat Quay on OpenShift Container Platform
  • Red Hat Quay Standalone
  • OpenShift Data Foundation

Issue

  • Migrate object storage from ODF/NooBaa to new S3 storage
  • Modify Quay to use the new S3 storage

Resolution

To migrate Quay object storage from ODF to new S3 storage, copy all blobs from the current storage engine to the new S3 bucket as-is, with no changes to the directory tree. Follow either of the two approaches:

  • Use a tool like rclone to configure the source and destination buckets as remotes and then sync them directly.
  • Use a tool like awscli or s3cmd to copy all blobs to a local directory first and then push them to the remote bucket.

An example Quay storage configuration using NooBaa:

DISTRIBUTED_STORAGE_CONFIG:
  default:
  - RHOCSStorage
  - access_key: xxxxxxxxxxxxxxxxx
    bucket_name: quay-bucket-xxx
    hostname: s3.xxxxxxx
    is_secure: true
    port: "443"
    secret_key: xxxxxxxxxxxxxxxxxxxxxxxx
    storage_path: /datastorage/registry
DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS: []
DISTRIBUTED_STORAGE_PREFERENCE:
- default

The following sections detail the procedure for each tool.

  • Using rclone:
    Start the configuration by running:

    rclone config
    

    Answer the questions for the remote S3 setup. For the type of storage to configure, choose Amazon S3 storage. The remote URL should be the public ODF S3 route:

    oc get route s3 -n openshift-storage -o jsonpath='{.spec.host}'
    

    Configure two remotes, one for the source and one for the destination.
    Use rclone sync to sync the two buckets.
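    As a concrete sketch, with a recent rclone the two remotes can also be created non-interactively with rclone config create. The remote names odf-src and new-dst, the destination endpoint, and all keys and bucket names below are placeholders:

```shell
# Create the source remote pointing at the ODF/NooBaa S3 endpoint
# (odf-src/new-dst and all credentials below are placeholder values)
rclone config create odf-src s3 provider=Other \
    access_key_id=OLD_ACCESS_KEY \
    secret_access_key=OLD_SECRET_KEY \
    endpoint="https://$(oc get route s3 -n openshift-storage -o jsonpath='{.spec.host}')"

# Create the destination remote for the new S3 storage
rclone config create new-dst s3 provider=Other \
    access_key_id=NEW_ACCESS_KEY \
    secret_access_key=NEW_SECRET_KEY \
    endpoint=https://new-s3.example.com

# Sync the source bucket to the destination, preserving the directory tree
rclone sync --progress \
    odf-src:quay-bucket-xxx/datastorage/registry \
    new-dst:new-quay-bucket/datastorage/registry
```

    rclone sync only transfers new and changed objects, so re-running it later copies just the difference between the buckets.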

  • Using awscli:
    The procedure is similar, except the buckets are not synced directly: copy the blobs to a local directory first, then push them to the destination bucket.

    1) Set the following environment variables for the NooBaa/ODF bucket (source bucket):

    # export AWS_ACCESS_KEY_ID="xxxxxxxxxxxxxxx"
    # export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    # export BUCKET_NAME="quay-bucket-xxxxxxx"
    # export HOSTNAME="$(oc get route s3 -n openshift-storage -o jsonpath='{.spec.host}')"
    # mkdir -pv /path/to/local/blob-storage && cd /path/to/local/blob-storage
    

    2) Start the sync by running the following command:

    # aws s3 sync --endpoint-url https://$HOSTNAME --no-verify-ssl s3://$BUCKET_NAME/datastorage/registry/ .
    

    Example output (you should see something like this):

    # aws s3 sync --endpoint-url https://$HOSTNAME --no-verify-ssl s3://$BUCKET_NAME/datastorage/registry/ .
    download: s3://quay/datastorage/registry/_verify to ./_verify
    download: s3://quay/datastorage/registry/sha256/00/00ded6dd259e01c07dd9c11e0376d6b68fd9ecbcc220c20e4d92b8a097957e17 to sha256/00/00ded6dd259e01c07dd9c11e0376d6b68fd9ecbcc220c20e4d92b8a097957e17
    download: s3://quay/datastorage/registry/sha256/01/015d58ae7e8566fa2fb63158901a71e225323caf3590e0d39a9df2fa064c8095 to sha256/01/015d58ae7e8566fa2fb63158901a71e225323caf3590e0d39a9df2fa064c8095
    download: s3://quay/datastorage/registry/sha256/02/027e7bc960b0ccc962cc26d6fb868dd128f690a577cade86817f8d2798cec8ba to sha256/02/027e7bc960b0ccc962cc26d6fb868dd128f690a577cade86817f8d2798cec8ba
    download: s3://quay/datastorage/registry/sha256/04/04a70abf62f6664d0a4c9e11f901c8fcd8eda139c3debde0df384f343c35d41c to sha256/04/04a70abf62f6664d0a4c9e11f901c8fcd8eda139c3debde0df384f343c35d41c
    download: s3://quay/datastorage/registry/sha256/01/01ed86be00a49e943fcc1405fa91bd0b49d06bba7adf21b9e6b8b12341a1cd07 to sha256/01/01ed86be00a49e943fcc1405fa91bd0b49d06bba7adf21b9e6b8b12341a1cd07
    download: s3://quay/datastorage/registry/sha256/01/01855694c96e27c0b1799b3e3d1f6157d79a31037f6c07e3a0cfd42301dfeccc to sha256/01/01855694c96e27c0b1799b3e3d1f6157d79a31037f6c07e3a0cfd42301dfeccc
    ...
    

    This process can take a while depending on the amount of data that needs to be copied over. Once all data has been copied, you should see several directories created locally, for instance:

    # ls -lahZ
    total 40K
    drwxr-xr-x   4 root root ? 4.0K Jul 30 08:34 .
    drwxrwxrwt  20 root root ?  20K Jul 30 08:29 ..
    drwxr-xr-x 213 root root ? 4.0K Jul 30 08:34 sha256
    drwxr-xr-x   2 root root ? 4.0K Jul 30 08:35 uploads
    -rw-r--r--   1 root root ?   11 Mar 27 15:55 _verify
    

    Note that the actual blobs live in the sha256 directory. The uploads directory contains only partially uploaded blobs and is not relevant for this migration; it can be removed before pushing the data:
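    The fan-out layout under sha256 is derived from each blob's digest: the first two hex characters of the digest select the subdirectory, which is why the tree must be copied unchanged. A minimal bash illustration (the digest is taken from the example output above):

```shell
#!/bin/bash
# Quay stores a blob at <storage_path>/sha256/<first two hex chars>/<digest>
digest="sha256:00ded6dd259e01c07dd9c11e0376d6b68fd9ecbcc220c20e4d92b8a097957e17"
hash="${digest#sha256:}"    # strip the algorithm prefix
echo "datastorage/registry/sha256/${hash:0:2}/${hash}"
# -> datastorage/registry/sha256/00/00ded6dd259e01c07dd9c11e0376d6b68fd9ecbcc220c20e4d92b8a097957e17
```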

    # rm -rf uploads
    

    3) Now, set the environment variables to reflect the new S3 storage engine:

    # export AWS_ACCESS_KEY_ID=new_S3_access_key
    # export AWS_SECRET_ACCESS_KEY=new_S3_secret_key
    # export BUCKET_NAME=new_S3_bucket
    # export HOSTNAME=new_S3_hostname
    

    4) Start the sync by running:

    # aws s3 sync --endpoint-url https://$HOSTNAME --no-verify-ssl . s3://$BUCKET_NAME/datastorage/registry/
    

    Example output (you should see something similar to this):

    # aws s3 sync --endpoint-url https://$HOSTNAME --no-verify-ssl . s3://$BUCKET_NAME/datastorage/registry/
    upload: sha256/22/225359f2d60ec2507d8edcbdf605b60766b41cebf43b243216b64b9924d61614 to s3://quay-2/datastorage/registry/sha256/22/225359f2d60ec2507d8edcbdf605b60766b41cebf43b243216b64b9924d61614
    upload: sha256/22/22d678755414b611d7bbf32ba4a77daac9d91398853f9fccf87a89e4f7b08685 to s3://quay-2/datastorage/registry/sha256/22/22d678755414b611d7bbf32ba4a77daac9d91398853f9fccf87a89e4f7b08685
    upload: sha256/20/20f23ac4815a3011beb7e0d30e3a593f6524adef3dc1aff7e8b2712116caaed6 to s3://quay-2/datastorage/registry/sha256/20/20f23ac4815a3011beb7e0d30e3a593f6524adef3dc1aff7e8b2712116caaed6
    upload: sha256/24/24c82be8f4fe0188ef9a0c2975609861c769239743a8c41f2d5cd7e01b60997d to s3://quay-2/datastorage/registry/sha256/24/24c82be8f4fe0188ef9a0c2975609861c769239743a8c41f2d5cd7e01b60997d
    ...
    

    Make sure that the destination path is correct, i.e. that the prefix (the storage_path property from the storage driver) is set properly. Otherwise, blobs will not be pullable from the destination bucket.
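    For example, with storage_path: /datastorage/registry, the destination URL passed to aws s3 sync is built as follows (the bucket name is a placeholder):

```shell
#!/bin/bash
# The sync destination is <bucket> + the storage_path from DISTRIBUTED_STORAGE_CONFIG
BUCKET_NAME="new_S3_bucket"
STORAGE_PATH="/datastorage/registry"   # storage_path property, unchanged
echo "s3://${BUCKET_NAME}${STORAGE_PATH}/"
# -> s3://new_S3_bucket/datastorage/registry/
```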

Of the two proposed approaches, rclone is preferable because it syncs the two buckets directly instead of copying all data locally and then pushing it to the new bucket.
Install these tools on a host that has access to both the old and the new S3 storage to execute the migration.

It is recommended to perform the migration in steps to minimize downtime:

  1. Run rclone sync once for an initial sync of all blobs.

  2. After 2-3 days, run rclone again to sync the difference between the buckets.

  3. One day before the maintenance window, sync the buckets again with rclone.

  4. On the day of migration, put Quay into maintenance mode and make it read-only, either by setting DISABLE_PUSHES: true (for Quay >3.13) or by setting the whole registry read-only following the article: https://access.redhat.com/articles/5411111.
    Run rclone sync again to transfer any remaining blobs that were not transferred before.

  5. When rclone is done, edit the init config bundle in the OpenShift console under Workloads -> Secrets (or, for standalone Quay, edit config.yaml directly) and make the required changes for the new S3 storage being used:

    DISTRIBUTED_STORAGE_CONFIG:
      default:          # <--- MUST NOT CHANGE (storage name)
      - RHOCSStorage    # <--- Change to appropriate driver
      - access_key: xxxxxxxxxxxxx          # <--- set to new access key
        bucket_name: quay-bucket-xxxxxx        # <--- set to new bucket name
        hostname: s3.xxxxxxxxxxxxxxxxxx         # <--- set to new hostname
        is_secure: true    # <--- change as per the requirement
        port: "443"        # <--- change as per the requirement
        secret_key: xxxxxxxxxxxxxxxxxxxxxxxxxxx     # <--- set to new secret access key
        storage_path: /datastorage/registry  # <--- remains the same
    DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS: []
    DISTRIBUTED_STORAGE_PREFERENCE:
    - default       # <--- MUST NOT CHANGE
    

    Do not change the storage name: blobs are tracked in Quay's database under that name, so changing it would break pulls.

  6. Make sure to add the new storage certificate to Quay's cert bundle. Grab the storage certificate by running (replace S3_STORAGE_HOSTNAME with the hostname of the new S3 storage):

   # openssl s_client -connect S3_STORAGE_HOSTNAME:443 -showcerts 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > extra_ca_cert-s3-storage.crt

  7. In the init config bundle, create a new key called extra_ca_cert-s3-storage.crt and paste the output of the openssl command as its content. Save the new init config bundle while keeping Quay in read-only mode, and redeploy.
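Before redeploying, it can be worth sanity-checking the fetched certificate file, for example (assuming extra_ca_cert-s3-storage.crt was created by the openssl command above):

```shell
# Print subject, issuer and expiry of the certificate that will be added to
# Quay's cert bundle; a parse error here means the s_client capture failed
openssl x509 -in extra_ca_cert-s3-storage.crt -noout -subject -issuer -enddate
```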

Once Quay comes up, log in via the UI and check for errors. Try pulling random images from the registry, making sure these images are not cached locally. If "FEATURE_PROXY_STORAGE: false" is set for Quay, the client (podman) connects to the backend storage directly and actually pulls blobs from it. If pulls work without issues, revert Quay to read-write mode and attempt to push a new image to the registry. All of these operations must succeed.

If any of the tests fail, the migration was not done properly, and we recommend reverting to the old storage. Do not decommission the old storage until everything works correctly.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
